REMARKS

Discussion of claim amendments. 

Applicants have amended the claims as follows. 

Independent claim 30 has been amended to incorporate the subject matter of claim 34, 
directed to where the mutant alga is "of Chlamydomonas spp.", and claim 34 has been canceled. 

Also, claims 31 to 33, and 35 have been canceled. 

Claim 36 has been amended to depend from claim 30, instead of now canceled claim 35. 
Claim 37 has been canceled. 

Independent claim 38 has been amended to recite that the mutant alga is "of 
Chlamydomonas spp.", which is one item of the Markush group recited in claim 42, and claim 42 
has been canceled. 

Also, claims 39 to 41, and 43 have been canceled. 

Claim 44 has been amended to depend from claim 38, instead of now canceled claim 43. 

No new matter has been added by any of the amendments to the claims, and thus, the 
Examiner is respectfully requested to enter the amendments to the claims. 



Overview. 

Applicants respectfully present arguments below in support of independent claims 30 and 
38, which are directed to a mutant alga of Chlamydomonas spp. (i.e., to species of alga in the 
genus Chlamydomonas), without restricting claims 30 and 38 to the species Chlamydomonas 
reinhardtii, and more particularly without restricting them to the Stm6 strain, to which 
independent claim 28 is already restricted. 



Claim Objection to claim 37. 

The Examiner objected to claim 37 as a substantial duplicate of claim 28. In view of this 
objection, applicants have canceled claim 37. 

Hence, the Examiner is respectfully requested to withdraw the objection to claim 37. 

Claim Rejections. 

Rejection of Claims 30 and 38 under 35 USC Section 112, second paragraph. 

The Examiner rejected claim 30, and asserted that the term "illuminated conditions" is 
confusing and vague on the grounds that it is unclear what illuminated conditions means. 

Applicants respectfully draw the attention of the Examiner to the fact that the term is 
defined in the specification at page 27, lines 12 to 15. As the Examiner will appreciate from this 
definition, claim 30 defines a mutant alga of Chlamydomonas species which is capable of 
hydrogen production when there is sufficient light available to it for photosynthesis to take place. 
Representative indications of the light intensity are given at page 27, lines 15 to 20 of the 
specification, but as the person ordinarily skilled in the art would appreciate, photosynthesis can 
take place at very low to very high light intensities. Hence, for all intents and purposes, the 
effect of the phrase "illuminated conditions" is to indicate that the organism is capable of 
hydrogen production in the light, as opposed to when it is kept deliberately in a darkened 
environment. 

The Examiner rejected claims 30 and 38, and asserted that the recitation "HydA" is 
confusing as it is unclear whether it refers to a specific hydrogenase or not. 

Applicants respectfully confirm that HydA is the name of a specific hydrogenase as 
would be understood by the person ordinarily skilled in the art. 

Applicants respectfully draw the attention of the Examiner to the fact that the process of 
hydrogen production in the green alga Chlamydomonas reinhardtii is described in the 
specification from line 12 of page 3 to line 5 of page 4, from which it is apparent that under 
illuminated, anaerobic conditions the hydrogenase, HydA, located in the chloroplast stroma, 
catalyses the conversion of electrons and protons to hydrogen gas which is released from the cell 
while ATP is generated in the chloroplast. This process also is illustrated in Figure 1 and Figure 
2 and the role of HydA is clearly shown. A redox-controlled regulation mechanism operates 
under transient light conditions to switch from linear to cyclic photosynthetic electron transport 
under appropriate conditions. However, the enzyme HydA is extremely sensitive to inhibition by 
oxygen, as noted by Melis et al. (2000), cited and described in the specification at page 4, lines 
7-15, and therefore, efforts have been devoted to temporal separation of oxygen generation from 
the oxygen-sensitive hydrogen production process catalyzed by the chloroplast hydrogenase, 
HydA. 

Further, as noted by Florin et al. (2001), cited and described in the specification at page 3, 
lines 12-24, many organisms have an enzyme capable of catalyzing the reversible reduction of 
protons to molecular hydrogen. There are several phylogenetically distinct groups of 
hydrogenase enzymes: nickel iron hydrogenases, iron hydrogenases and metal-free types. The 
iron hydrogenases have been found in hydrogen-producing anaerobic bacteria and protozoa, and 
more recently in green algae such as C. reinhardtii and S. obliquus. In hydrogenase 
nomenclature, the term "Hyd" is proposed to be reserved for these enzymes, with the 
terminal letter ("HydA") distinguishing between enzymes where necessary. A second 
category of iron hydrogenase is composed of mostly oligomeric enzymes that interact with 
NAD(P) and contain domains homologous to the NuoE and NuoF subunits of complex I, and the 
suggested nomenclature for the catalytic subunits is HndA. The proposal for nickel iron 
hydrogenases involves the use of HynSL, HupSL, and so on. So the gene HydA encodes a 
protein which catalyses the reversible reduction of protons to molecular hydrogen and is either a 
monomeric iron hydrogenase or the large subunit of a dimeric iron hydrogenase (the small 
subunit is referred to as HydB). 

Attorney Docket No.: 012930-000026 
Amendment responsive to August 23, 2007 Office Action 
US Appln No. 10/562,512 

There is no reason to suppose that the HydA gene differs greatly in structure across all of 
the species in the genus Chlamydomonas although, as there always is, there will be some degree 
of sequence difference across species within the genus. This is well understood by the person 
ordinarily skilled in the art, and applicants respectfully submit that, based on the claims as 
amended above, the Examiner cannot make a prima facie case that this is not so, and thus, cannot 
properly present a rejection of the claims as amended above. 

To return to the point made by the Examiner, applicants respectfully point out that the 
term "a hydrogenase" clearly refers to a group of different enzymes as discussed above. Even 
the subset of this group, the iron hydrogenases, is a group of enzymes. While terminology has 
not always been used consistently, it is clear that there is standardized nomenclature now in place 
in which the genes encoding iron hydrogenases are the "Hyd" genes and HydA is the name of 
one member of the group. HydA will nevertheless have variations in sequence across species, as 
all genes do, as is well understood by the person ordinarily skilled in the art. Therefore, while 
the HydA in C. reinhardtii might differ slightly in sequence from the HydA gene in other 
hydrogen-producing Chlamydomonas species, it is, without doubt, the same gene in 
phylogenetic and functional terms. 

The Examiner rejected claims 30 and 38, and asserted that the recitation "Mod" is 
confusing as it is unclear whether "Mod" is from a specific organism or not. With regard to the 
objection to Mod, applicants respectfully submit that there is substantial description of the 
Mod gene in the specification. Mod encodes a transcription factor homologous to human 
mTERF as discussed at page 37, lines 5 to 9 of the specification. Mod is discussed further at 
page 41 starting from line 6 of the specification. It is a nuclear-encoded, mitochondrial DNA- 
binding protein, and deletion of the activity of Mod results in de-regulation of the mitochondrial 
electron transport pathway, as discussed in the specification at page 41, line 6 to line 24. The 
result is inhibition of photosynthetic cyclic electron flow, and with more electrons available, 
there is increased hydrogen production. A comparative sequence analysis of the Mod gene, 
which was first identified in Chlamydomonas reinhardtii by the present inventors, shows that 
there are striking similarities to the human mTERF protein; the sequence alignment is given in 
Figure 8. Mod also has homologues in Drosophila melanogaster and sea urchin, as discussed at 
page 35, lines 31-34 of the specification. Additionally, nine homologues to Mod with mTERF 
domains have been identified in the genome of Arabidopsis thaliana. 

There is a paucity of sequence data available for algae. To applicants' knowledge, the 
only other green alga for which sequence data is available is Volvox carteri (which, like 
Chlamydomonas, belongs to the order Volvocales). More particularly, applicants conducted an 
internet search using the well-known BLAST (Basic Local Alignment Search Tool). The 
BLAST search demonstrated that a homolog of Mod exists on scaffold 37 of the JGI Volvox 
database. 

Accordingly, the Examiner is respectfully requested to withdraw the rejection of claims 
30 and 38 under 35 USC Section 112, second paragraph. 



Rejection of Claims 30-36 and 38-44 under 35 USC Section 112, first paragraph. 

The Examiner rejected claims 30 to 36 and 38 to 44 on the basis that the claims are 
directed to "any mutant alga from any source expressing any HydA hydrogenase having a 
mutation that results in reduced activity of any mitochondrial transcription factor comprising any 
Mod". 

Applicants respectfully note that the specific issues concerning the assertion that "any 
HydA" or "any Mod" is employed have been addressed above, in the response to the rejection 
under 35 USC Section 112, second paragraph. 

Additionally, applicants respectfully submit that the amendment above to each of 
independent claims 30 and 38 restricts these claims, and also dependent claims 36 and 44, 
which respectively depend from independent claims 30 and 38, to algae species within the genus 
Chlamydomonas. 

The genus Chlamydomonas is not a large and variable group, but a tightly linked 
phylogenetic clade in which at least the species Chlamydomonas applanata, Chlamydomonas 
chlamydogama, Chlamydomonas debaryana, Chlamydomonas dorsoventralis, Chlamydomonas 
elliptica, Chlamydomonas eugametos, Chlamydomonas hindakii, Chlamydomonas hydra, 
Chlamydomonas moewusii, Chlamydomonas reinhardtii and Chlamydomonas texensis are 
known to produce hydrogen under anaerobic conditions. Therefore, the amended claims do not 
claim any mutant alga from any source, but the members of a tightly linked phylogenetic clade in 
which the expectation would be that hydrogen production would occur by the same mechanism 
using the same enzyme, HydA, under the same control mechanisms. 

Applicants respectfully further note that Chlamydomonas reinhardtii is a useful 
experimental model in the way that Saccharomyces cerevisiae and Arabidopsis thaliana are 
powerful models for dissecting basic biological processes in yeast and plants respectively. The 
first draft of the Chlamydomonas nuclear genome sequence has been made available. See 
attached, Dent et al., "Functional Genomics of Eukaryotic Photosynthesis Using Insertional 
Mutagenesis of Chlamydomonas reinhardtii", vol. 137, Plant Physiology (February, 2005), pp. 
545-556. Many tools have been developed to allow for manipulation of C. reinhardtii. The 
generation of tagged insertional mutations by nuclear transformation has allowed, for example, 
the study of oxygenic photosynthesis in eukaryotes, as photosynthesis in Chlamydomonas is 
very similar to that of land plants. Plasmid, cosmid and bacterial artificial chromosome (BAC) 
libraries are used to rescue nuclear mutations, and expression of specific genes can be repressed 
using both antisense and RNA interference technologies. See attached, Grossman et al., 
"Chlamydomonas reinhardtii at the Crossroad of Genomics", vol. 2, no. 6, MINIREVIEW, 
Eukaryotic Cell (December, 2003), pp. 1137-1150. 

Additionally, there is confirmation in Grossman et al. that gene disruption is routine once 
sequence information is available, and further that completion of sequence information permits 
targeted generation of mutations (see pages 1142 and 1143 of Grossman et al.). Thus, there is 
confirmation in a publication made after the priority date of the routine nature of the relevant 
techniques once (a) a discovery concerning the utility of a gene is made and (b) it is identified 
and characterized so as to have sequence information available. The publication nevertheless 
contains no teaching or suggestion of any specific finding surrounding Mod or hydrogen 
production. 

The present inventors produced the Stm6 strain. The process involved a random insertion 
of the plasmid pArg7.8, carrying the Arg7 gene, into the genome of the arginine-auxotrophic 
strain CC 168, followed by identification of potential state transition mutants. Stm6 was 
identified and found to be blocked in state 1 due to insertion of the pArg7.8 plasmid in the Mod 
gene. There was an additional insertion in a nuclear transposon (Toc1), as discussed at page 34, 
lines 20-28 of the specification. PCR analysis of Stm6 and the wild type resulted in the 
amplification of a 1005 bp PCR product in Stm6. This confirmed that the insertion caused the 
deletion of only part of Mod, and that 512 base pairs of Mod remain. Figure 8 gives 
the protein sequence of Mod and an alignment with the human transcription termination factor 
mTERF. With this sequence information, the person ordinarily skilled in the art may employ the 
tools that exist for Chlamydomonas reinhardtii to knock out expression of the gene, without 
undue experimentation. For example, antisense and RNAi techniques may be employed to 
silence the gene. Alternatively, site-specific mutagenesis could be used to introduce 
activity-destroying mutations, and/or an antibody to Mod could be generated, as would be well 
understood by the person ordinarily skilled in the art. 

Further, applicants respectfully submit that it is the reduction of Mod activity in the 
Stm6 mutant from which it derives its ability to produce greater quantities of hydrogen. The 
hydrogenase HydA is naturally present in the organism, and therefore there is no reason that the 
specification should describe how to make algae expressing HydA, as the Examiner appears to 
require. The hydrogenase is not manipulated or altered in the present invention. Rather, the 
Mod knockout or knockdown induces changes in the organism which increase linear electron 
transport to HydA and reduce cyclic electron transport as discussed, for example, at page 41, 
lines 6 to 24 of the specification. The reduction or elimination of Mod activity by any means 
will achieve this end. Applicants respectfully disagree with the Examiner's assertion that this is 
not reasonably predictable on the face of the specification because the specification does not 
establish the structure of Mod. In fact, sequence information for the Mod gene is provided in 
Figure 8 and in the sequence listing. The person skilled in the art, using this information 
together with the tool kit available for Chlamydomonas reinhardtii and known techniques, 
could modify the organism without undue experimentation. 

Accordingly, the Examiner is respectfully requested to withdraw the rejection of claims 
30-36 and 38-44 under 35 USC Section 112, first paragraph. 



CONCLUSIONS 

In view of the above amendments and remarks, applicants respectfully request the 
Examiner to withdraw the objection to claim 37, the rejection of claims 30 and 38 under 35 USC 
Section 112, second paragraph, and the rejection of claims 30-36 and 38-44 under 35 USC 
Section 112, first paragraph. 

Allowance is earnestly solicited. If the Examiner should have any questions, he is 
respectfully requested to telephone the undersigned to resolve any such issues, and obviate the 
issuance of another Office Action. 

DEPOSIT ACCOUNT 

Although it is believed that no fee is due, the Commissioner is authorized to charge any 
deficiencies of payment associated with this Communication, or to credit any overpayment, to 
Deposit Account No. 13-4365. 

Respectfully submitted, 
MOORE & VAN ALLEN PLLC 

Date: November 21, 2007 

Jennifer L. Skord 
Registration No. 30,687 
Moore & Van Allen PLLC 
430 Davis Drive, Suite 500 
Morrisville, NC 27560-6832 
Telephone: (919) 286-8000 
Facsimile: (919) 286-8199 

Encls.: 

Dent et al., "Functional Genomics of Eukaryotic Photosynthesis Using Insertional Mutagenesis 
of Chlamydomonas reinhardtii", vol. 137, Plant Physiology (February, 2005), pp. 545-556 

Grossman et al., "Chlamydomonas reinhardtii at the Crossroad of Genomics", vol. 2, no. 6, 
MINIREVIEW, Eukaryotic Cell (December, 2003), pp. 1137-1150 

Functional Genomics of Eukaryotic Photosynthesis Using 
Insertional Mutagenesis of Chlamydomonas reinhardtii 

Rachel M. Dent, Cat M. Haglund, Brian L. Chin, Marilyn C. Kobayashi, and Krishna K. Niyogi* 
Department of Plant and Microbial Biology, University of California, Berkeley, California 94720-3102 

The unicellular green alga Chlamydomonas reinhardtii is a widely used model organism for studies of oxygenic photosynthesis 
in eukaryotes. Here we describe the development of a resource for functional genomics of photosynthesis using insertional 
mutagenesis of the Chlamydomonas nuclear genome. Chlamydomonas cells were transformed with either of two plasmids 
conferring zeocin resistance, and insertional mutants were selected in the dark on acetate-containing medium to recover 
light-sensitive and nonphotosynthetic mutants. The population of insertional mutants was subjected to a battery of primary and 
secondary phenotypic screens to identify photosynthesis-related mutants that were pigment deficient, light sensitive, 
nonphotosynthetic, or hypersensitive to reactive oxygen species. Approximately 9% of the insertional mutants exhibited 1 
or more of these phenotypes. Molecular analysis showed that each mutant line contains an average of 1.4 insertions, and 
genetic analysis indicated that approximately 50% of the mutations are tagged by the transforming DNA. Flanking DNA was 
isolated from the mutants, and sequence data for the insertion sites in 50 mutants are presented and discussed. 



As with other model organisms, the availability of genome sequence data is revolutionizing and 
revitalizing research into the biology of the unicellular green alga Chlamydomonas reinhardtii 
(Grossman et al., 2003; Ledford et al., 2005). Over the past four decades, many fundamental 
insights into the structure, function, assembly, and regulation of the photosynthetic apparatus 
have come from studies of Chlamydomonas, which offer several advantages for the genetic 
dissection of eukaryotic photosynthesis (for review, see Davies and Grossman, 1998; Hippler 
et al., 1998; Grossman, 2000; Dent et al., 2001; Rochaix, 2001). First and foremost, 
photosynthesis is fully dispensable in Chlamydomonas, as cells can grow heterotrophically in 
the dark using acetate as a sole carbon source. Cells grown in the dark, however, still synthesize 
and assemble a fully functional photosynthetic apparatus. This allows the isolation and analysis 
of mutants that are unable to perform photosynthesis, and light-sensitive mutants can be 
maintained in complete darkness. Because Chlamydomonas is predominantly maintained in a 
haploid form, it is not necessary to generate homozygous nuclear mutants, and mutants affecting 
photosynthesis can be screened immediately following mutagenesis. Chlamydomonas has an 
easily controlled and rapid sexual cycle (approximately 2 weeks) with the possibility of tetrad 
analysis, which facilitates genetic analysis. Its rapid cell-doubling time (approximately 10 h) 
and microbial lifestyle mean that it is easy to grow homogeneous cultures on any scale, 
simplifying physiological and biochemical characterization in comparison to multicellular land 
plants (Ledford et al., 2005). By way of example, the application of inhibitors and generators of 
various types of reactive oxygen species results in uniform uptake of the chemical by each cell. 
In land plants, multicellularity leads to differential uptake of exogenous substances based upon 
the distance from or method of application, and different tissue and cell types may react 
differently to any given chemical, making analysis of results difficult. 

In spite of these differences, however, the photosyn- 
thetic apparatus of Chlamydomonas is very similar to 
that of land plants, making it a useful comparative 
system for understanding plant metabolism and pho- 
tosynthesis (Gutman and Niyogi, 2004). As a member 
of the division Chlorophyta, Chlamydomonas is also 
a useful model for investigating evolutionary relation- 
ships among the green algae and thus the origins of 
photosynthesis in land plants. 

The first draft of the Chlamydomonas nuclear 
genome sequence was released in January, 2003 
(Grossman et al., 2003), and a complete, fully anno- 
tated version is expected in the near future. The recent 
accumulation of expressed sequence tag (EST) se- 
quence data (Asamizu et al., 1999; Shrager et al., 
2003) has both facilitated annotation and given some 
indication of the degree of accuracy that can be 
achieved when using bioinformatic tools to predict 
gene structure from assembled sequence data in this 
organism. The completion of the genome sequences of 
Volvox carteri and Ostreococcus tauri will also aid in this 
endeavor to identify the complete gene set of Chla- 
mydomonas. 

Now that the sequencing phase of Chlamydomonas genomics is nearing completion, the next step 
is the functional characterization of the genes. Sequence comparison and phylogenetic 
approaches can be used to identify putative functional homologs of genes whose functions are 
known in other organisms, but mutagenesis is one of the most powerful methods for assigning 
function to a given gene or gene family. In Chlamydomonas, insertional mutagenesis has proved 
to be a very useful tool in forward genetics studies, which aim to identify genes involved in a 
given process. Integration of exogenous DNA into the nuclear genome of Chlamydomonas 
occurs predominantly by nonhomologous recombination, thus leading to random gene 
disruption (Tam and Lefebvre, 1993). In most cases, insertional mutagenesis creates null 
mutations. In comparison to point mutations, insertional mutagenesis allows the isolation of 
sequence flanking the mutation by methods such as plasmid rescue and PCR-based techniques. 
Although the recent development of a detailed molecular map (Kathir et al., 2003) has made the 
mapping of point mutations relatively rapid in Chlamydomonas, this is still not a viable 
alternative for high-throughput analysis of large numbers of mutants. 

Although insertional mutagenesis has been used 
extensively in the investigation of many areas of 
Chlamydomonas biology, only one study has described 
the use of the technique at a genomics level. Pazour and 
Witman (2000) reported the use of a genomic approach, 
involving both forward and reverse genetics, to isolate 
mutations affecting the outer dynein arm of Chlamy- 
domonas flagella. This structure consists of a total of 15 
proteins, thus giving some indication of the number of 
target genes that were involved. Mutations in the outer 
dynein arm result in a characteristic slow, jerky, swim- 
ming phenotype. After screening 15,000 insertional 
mutants for this phenotype, mutations in 7 of the 15 
target genes were identified. 

The paucity of studies at the genome level illustrates the need for more extensive functional 
genomic analyses and resources for Chlamydomonas to complement the already considerable 
sequence information that is available. The generation of large mutant collections has been vital 
in the development and use of other model plant systems such as Arabidopsis (Arabidopsis 
thaliana; Krysan et al., 1999; Tissier et al., 1999; Parinov and Sundaresan, 2000; McElver et al., 
2001; Sessions et al., 2002; Alonso et al., 2003), rice (Oryza sativa; Jeon et al., 2000; Chen 
et al., 2003; Kolesnik et al., 2004; Sallaud et al., 2004), and maize (Zea mays; Raizada et al., 
2001; May et al., 2003). Therefore, we have initiated a large-scale forward genetics project 
using insertional mutagenesis that aims to saturate the Chlamydomonas nuclear genome for 
mutations affecting photosynthesis as part of the Chlamydomonas Genome Project (Grossman 
et al., 2003). In this article, we describe the mutant generation and screening methods being 
employed in this project. As a resource to workers in the field who will be using these mutants, 
the phenotypic, molecular, and genetic characteristics of a subset of mutants are reported here, 
in addition to flanking sequence data. The whole population of phenotypically characterized 
mutants and a searchable sequence database will be available to the scientific community as 
they are generated over the next several years. 

RESULTS 

Generation of Insertional Mutants 

To isolate insertional mutants affecting all aspects of photosynthesis in Chlamydomonas, 
selection of transformed cells in the dark was necessary. Although mutants incapable of 
photoautotrophic growth can be isolated and maintained as acetate-requiring mutants in the 
light, this approach does not allow the recovery of all photosynthetic mutants (Spreitzer and 
Mets, 1981). Very few mutants with defects in the CO2 fixation reactions of photosynthesis, for 
example, can be recovered this way, because the mutants are light sensitive. 

After comparison of the growth of several wild-type Chlamydomonas strains in the dark, the 
strain 4A+ in the 137c genetic background was selected as the parental strain for the population 
of insertional mutants based on its ability to grow well and remain green in the dark. Cells were 
transformed with either of 2 linearized plasmids, pSP124S or pMS188 (Fig. 1), containing the 
ble gene, which confers resistance to the antibiotic zeocin (bleomycin), and transformants were 
selected on acetate-containing medium in the dark. Transformation efficiencies using the 4A+ 
strain were 86.5 transformants/µg DNA for pSP124S and 115.5 transformants/µg DNA for 
pMS188. Both of these efficiencies are lower than those reported for these plasmids in other 
studies (Lumbreras et al., 1998; Schroda et al., 2002), suggesting that 4A+ may transform at 
lower efficiencies than cell wall-deficient strains and other strains that were used previously. 
Here we report data for a total of 2,000 insertional mutants generated using pSP124S and 760 
using pMS188. 

Figure 1. Diagram of linearized plasmids used for insertional mutagenesis. Relevant restriction 
enzyme sites are shown. Arrows indicate the approximate positions of specific primers used for 
TAIL-PCR. 

[Figure 2. Screening scheme] 

Primary transformants picked 
- 1.5 mM Metronidazole (MZ), low light (LL), complete medium 
- 2 µM Rose Bengal (RB), low light (LL), complete medium 

1) VLL, complete medium 
2) VLL, minimal medium 
3) LL, complete medium 
4) LL, minimal medium 
5) LL, minimal medium + high CO2 
6) HL, minimal medium 
7) HL, minimal medium + high CO2 
8) 1.5 mM MZ, LL, complete medium 
9) 1.5 mM MZ, LL, minimal medium 
10) 1.5 mM MZ, VLL, complete medium 
11) 2 µM RB, LL, complete medium 
12) 1 µM RB, LL, minimal medium 
13) 2 mM RB, VLL, complete medium 
14) Chlorophyll fluorescence 

Results of Phenotypic Screening 

Insertional mutants were subjected to primary and secondary rounds of phenotypic screening 
(Fig. 2). The primary screens included incubation of the mutants at high light (HL; 500 µmol 
photons m⁻² s⁻¹) on minimal medium to isolate all light-sensitive or nonphotosynthetic clones. 
Two generators of reactive oxygen species were used to isolate mutants that are sensitive to 
photooxidative stress, which often accompanies photosynthesis. Like chlorophyll, Rose Bengal 
(RB) generates singlet oxygen in the presence of light. By growing cells on medium containing 
RB, elevated levels of singlet oxygen would be present within cells and in the surrounding 
medium. Metronidazole (MZ), however, acts by accepting electrons from reduced ferredoxin 
and catalyzing superoxide formation in the chloroplast compartment of Chlamydomonas 
(Schmidt et al., 1977). The secondary screening methods were designed to characterize the 
phenotype of primary mutants more fully by assessing the degree of light sensitivity (at various 
light intensities) and ascertaining whether the response to generators of reactive oxygen species 
was dependent on photoautotrophic or heterotrophic growth conditions (Fig. 2). 

The proportions of mutants in each major phenotypic class are presented in Table I. The total 
proportion of mutants showing a phenotype in any of the screens was 8.8%. It should be noted 
that the classes of mutants presented in Table I are not mutually exclusive, and thus mutants 
may show a phenotype in more than one of the test screens. 

The largest class of mutants recovered was the acetate-requiring mutants. In agreement with 
Spreitzer and Mets (1981), most of these also exhibited some sensitivity to light, either at the 
low-light (LL; 80 µmol photons m⁻² s⁻¹) or HL level. Secondary screening showed that 18% of 
the acetate-requiring mutants could be rescued, at least partially, under conditions of high CO2. 
The LL-, HL-, RB-, and MZ-sensitive classes all occurred at a frequency of approximately 2.3%. 
Of the total number of mutants found to be sensitive to either generator of reactive oxygen 
species, only one-third showed sensitivity to both RB and MZ. The smallest mutant class 
comprised the pigment-deficient mutants, and these occurred at a frequency of 0.6%. This class 
included mutants that were pale green in all 


Table I. Percentage of mutants in each phenotypic class 

Phenotype                                    Percentage 
LL sensitive (80 µmol photons m⁻² s⁻¹)       2.3 
Acetate requiring                            3.8 
HL sensitive (500 µmol photons m⁻² s⁻¹)      2.3 
RB sensitive                                 2.3 
MZ sensitive                                 2.3 
Pigment deficient                            0.6 
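
As a quick arithmetic check on the statement that the phenotypic classes are not mutually exclusive, the following minimal Python sketch sums the per-class percentages listed in Table I and compares the total with the 8.8% of mutants reported to show a phenotype in any screen. The variable names are illustrative and do not come from the article, and the surplus is only a rough indicator of overlap, because the per-mutant class assignments are not given here.

```python
# Per-class percentages as listed in Table I (dictionary keys are illustrative labels).
class_percentages = {
    "LL sensitive": 2.3,
    "Acetate requiring": 3.8,
    "HL sensitive": 2.3,
    "RB sensitive": 2.3,
    "MZ sensitive": 2.3,
    "Pigment deficient": 0.6,
}

# Proportion of mutants reported to show a phenotype in any of the screens.
any_phenotype_pct = 8.8

# If the classes were mutually exclusive, their sum could not exceed the
# proportion of mutants with any phenotype.
class_sum = round(sum(class_percentages.values()), 1)
overlap_surplus = round(class_sum - any_phenotype_pct, 1)

print(f"Sum of class percentages: {class_sum}%")
print(f"Surplus over the 'any phenotype' total: {overlap_surplus} percentage points")
```

The class percentages sum to more than 8.8%, which is consistent with some mutants being counted in more than one class.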

Molecular Analysis of Transformants 

To characterize the average number of ble insertion loci in each mutant, DNA gel-blot analysis 
was carried out on those mutants that exhibited a phenotype in any of the screens. For the 
population generated using the pSP124S plasmid, 85 mutants were analyzed, and 30 were 
analyzed for which pMS188 was the transforming plasmid. Figure 3 shows examples of the 
DNA gel-blot analysis. It was found that, for both plasmids, approximately 70% of the 
transformants contained a single ble insertion locus (61/85 for pSP124S and 22/29 for 
pMS188). The average number of ble insertion loci for pSP124S was 1.4 and for pMS188 it 
was 1.3. It should be noted that this analysis would not be able to identify clones in which 
multiple ble insertions had occurred at one locus. 

In addition to probing for the sequence encoding the ble gene, 53 of the mutants were also 
analyzed for the presence of the origin of replication from the pBluescript portion of the 
transforming plasmid. Thirty-one of the 53 mutants (58.5%) were found to have 1 or more 
bands hybridizing to this sequence. Of these 31, however, 23 (74%) contained bands of the 
same size that hybridized to both the ble probe and the origin of replication probe. Because the 
genomic DNA was digested with NcoI, which should cut between these sequences in the 
linearized transforming plasmid (Fig. 1), bands of different sizes should be detected with the 
two probes. This suggests that the clones in which the same-size fragment was detected all 
contained tandem head-to-tail insertions at the same locus. 






Figure 3. DNA gel-blot analysis of insertional mutants. Arrows indicate
bands corresponding to endogenous RBCS2 sequences, and asterisks
indicate mutants containing multiple ble insertions. Size standards are
shown to the left. A, Mutants generated using pSP124S. Genomic DNA
was digested with NcoI, and the probe was a XbaI/BamHI fragment
from pSP124S. B, Mutants generated using pMS188. Genomic DNA
was digested with NheI, and the probe was a NheI/KpnI fragment from
pMS188.



Isolation and Sequencing of Flanking DNA 

After secondary screening, DNA was extracted from 
all mutants that rescreened with the same phenotype 
as recorded in the primary screen. Flanking DNA was 
amplified from each insertional mutant line using 
thermal asymmetric interlaced (TAIL)-PCR (Liu et al.,
1995). At least 1 DNA band was amplified in 77% of 
mutants where pSP124S was used as the transforming 
plasmid. Figure 4 shows a representative agarose gel 
analysis of fragments amplified from a subset of 
insertional mutants. The size of bands amplified using 
this technique ranged from <100 to 2,000 bp, with 
most bands being in the 100- to 1,000-bp range. Single 
bands were amplified in 44% of the mutants tested. 

One of the problems encountered with the TAIL- 
PCR technique is that some of the insertion lines 
contained concatameric insertion events at a single 
locus. As insertion events included tandem arrays of 
the transforming DNA, sequencing of the product 
from TAIL-PCR only yielded plasmid sequence. For 
pSP124S, many of these mutants could easily be 
recognized by a diagnostic band of 750 bp, and several 
other DNA fragments also yielded only plasmid 
sequence. Overall, in 15.3% of the mutants from which 
a TAIL-PCR product was amplified, it was not possible 
to obtain the flanking DNA sequence due to concata- 
merization at the site of insertion. 

Table II presents the flanking sequence results for the 
fragments generated by TAIL-PCR from 50 mutants. 
Sequences were compared to the Chlamydomonas

CAL005.01.20
CAL005.01.21
CAL005.01.26
CAL007.01.01

CAL007.01.25
CAL007.01.26
CAL007.01.29

CAL007.02.03
CAL007.02.09



RB sensitive
LL sensitive, acetate requiring
LL sensitive, acetate requiring
Bleaches at LL, RB, and MZ sensitive, low chlorophyll fluorescence
RB sensitive
Yellow in the dark and at VLL
Bleaches on MS, MZ sensitive
Bleaches on MZ/HS
Slight RB sensitivity
LL sensitive, acetate requiring, high chlorophyll fluorescence
Slight RB sensitivity
Acetate requiring, MZ sensitive
Slight acetate requirement
HL sensitive, acetate requiring, MZ and RB sensitive
RB sensitive
RB sensitive, slight MZ sensitivity
Slight RB sensitivity
LL sensitive, acetate requiring, partially rescued by high CO2, low chlorophyll content
LL sensitive, acetate requiring
RB sensitive, reduced pigment
Acetate requiring, HL sensitive, rescued by high CO2, RB sensitive, high chlorophyll fluorescence



Genome Position and Candidate Gene(s)ᵃ

185: 23950-23500
248: 49881-50798
  Genie 248.7 RBCS1 (Chlamydomonas)
  Genie 248.8 RBCS2 (Chlamydomonas)
1725: 13500-13889
2187: 552-627
248: 14045-13965
86: 72908-73019
  Genie 86.13: β-7 subunit of 20S proteasome (rice)
  Genie 86.14: Histone-binding protein N1/N2 (Xenopus laevis)
1243: 2407-2199
  Genie 1243.1 and 1243.2: Trans-splicing factor Raa3 (Chlamydomonas)
137: 48662-48363
  Genewise 137.14.1: crtH; carotene isomerase (Synechocystis sp. PCC6803)
125: 47270-46956
  Genewise 125.40.1: hemD, uroporphyrin III-synthase (Synechocystis sp. PCC6803)
45: 114609-114355
876: 16393-16434
  Genie 876.2: Acetyl CoA synthetase (Arabidopsis)
No genome similarity
1543: 7155-6541
416: 20998-20655
No genome similarity
  Identity to EST: 1031030D08.y1
Multiple hits, repeat region
387: 43424-43184
785: 20041-19556
  Genie 785.4: HSP101 (Arabidopsis)
239: 60731-60473
  Genie 239.1.1 Histone H2A (Chlamydomonas)
  Genie 239.37.1 Histone H3 (Volvox carteri)
  Genie 239.7.1 Histone H4 (Chlamydomonas)
119: 81404-81721
595: 18007-17767
1152: 5548-5360
  Genie 1152.2 ATP-synthase δ-chain (Chlamydomonas)
91: 50935-50800
  Genie 91.10: ODA1 outer dynein arm docking protein (Chlamydomonas)
  Genie 91.11 Silencing-related Ser-Thr kinase (rice)
563: 15107-14707
  Genie 563.3 Digalactosyldiacylglycerol synthase (Arabidopsis)
1535: 12114-12240
851: 9886-9587
  Genewise 851.5.1: Putative Cu(II)-type ascorbate-dependent monooxygenase (Arabidopsis)
Multiple hits in genome, repeat region

Mutant ID    Phenotype(s)    Genome Position and Candidate Gene(s)ᵃ    No. of ble Insertionsᵇ







89: 98986-99143
Genie 89.16: Potential Cu-transporting ATPase type 3 (Arabidopsis)
Genie 89.17: Glutathione-requiring prostaglandin D-synthase (Gallus gallus)

CAL007.02.31
Bleaches on MZ/HS
Genie [illegible]: Putative replication factor (Arabidopsis)

CAL007.02.38
Acetate requiring
1700: 10826-11047
Genie 1700.6: Putative NADP oxidase (Vibrio cholerae)
H2A, H2B-IV)
Genie 1700.1: Phosphoglycolate phosphatase chloroplast precursor (Chlamydomonas)

Slight MZ sensitivity
Genie 68.3: Protein phosphatase 2C ABI1 (Arabidopsis)

CAL007.03.02
RB sensitive
1380: 4193-4169

CAL007.03.03
HL sensitive, RB sensitive
Multiple hits in genome, repeat region
1

Slight MZ sensitivity
2640: 5102-5626
Genie 2640.0: 70-kD heat shock protein

CAL007.03.10
MZ sensitive
Multiple hits in genome, repeat region

CAL007.03.21
Acetate requiring
590: 28275-27677
Genie 590.2 and 590.3: repair endonuclease (Arabidopsis)

CAL007.03.22
MZ sensitive, slight RB sensitivity
276: 20397-20609
Genewise 276.32.1 Dynein 11-kD light chain flagellar outer arm (Chlamydomonas)
Genie 276.2 cgcr-4 protein (Chlamydomonas)
1

CAL007.03.26
MZ sensitive
Fragment 1: multiple hits, repeat region
Fragment 2: 228: 4384-4683

CAL007.03.32
Acetate requiring, HL sensitive, rescued by high CO2
3868: 836-879
2

CAL007.03.34
requiring, rescued by high CO2

CAL007.03.41
Acetate requiring, RB sensitive
Fragment 1: 199: 124-58
Genie 199.1: phosphoenolpyruvate-dependent sugar phosphotransferase system

CAL007.03.43
248: 42663-48811 (discontinuous)
Genie 248.8: RBCS2

CAL007.03.45
Acetate requiring
3: 156677-156894
2

CAL007.03.46
Acetate requiring, partially rescued by high CO2

CAL007.03.47
Acetate requiring, HL sensitive
45: 151884-152295

CAL010.01.02
RB sensitive
732: 12958-12763
Genie 732.2 Autolysin (gametolysin) precursor (Chlamydomonas)

CAL010.01.10
RB sensitive, pale green
1214: 1678-1021 (discontinuous)

CAL010.01.[illegible]
LL sensitive
62: 19180-19068
Genie 62.4
Genie 62.5 Succinate dehydrogenase (ubiquinone) iron-sulfur protein precursor (Drosophila)
1

CAL010.01.21
RB sensitive, MZ sensitive
23: 50197-50354

CAL010.01.31
RB sensitive
102: 2403-2506
Genewise 102.30.1: Calmodulin-binding protein (Arabidopsis)


n.d.

ᵃDetermined by comparison with the Chlamydomonas nuclear genome sequence, version 1.0 (http://genome.jgi-psf.org/chlre1/chlre1.home.html).
Alignment of the flanking sequence with the genome sequence is indicated by scaffold number (in bold) followed by sequence
range in base pairs. ᵇNumber of ble insertions determined by DNA gel-blot analysis. n.d., Not determined.

genome sequence (version 1.0; http://genome.jgi-psf.org/chlre1/chlre1.home.html)
and to Chlamydomonas ESTs if no genome similarity was found. Of
the 50 mutants presented, only 2 did not show similarity
to any region in the genome sequence, and 1 of these
showed similarity to an EST sequence. Because the
integration of transforming DNA in the Chlamydomonas
nucleus is sometimes accompanied by a deletion at
the site of insertion, candidate genes in Table II were
identified based on gene models that occur within a
10-kb interval beginning at the insertion site and extending
in the direction of the ble insert. The identification
of candidate genes was limited somewhat by
incomplete assembly and annotation of the genome.
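
The 10-kb candidate-gene rule described above can be illustrated with a minimal Python sketch. This is not the authors' procedure as implemented: the `GeneModel` type, the `candidate_genes` function, and the example coordinates are all hypothetical, and the sketch assumes that gene models and the insertion site are given as base-pair coordinates on a single scaffold.

```python
from typing import List, NamedTuple


class GeneModel(NamedTuple):
    name: str
    start: int  # scaffold coordinate in bp
    end: int    # scaffold coordinate in bp (start <= end)


def candidate_genes(insertion_site: int, toward_insert: int,
                    genes: List[GeneModel], window: int = 10_000) -> List[str]:
    """Names of gene models overlapping the interval that begins at the
    insertion site and extends `window` bp in the direction of the insert.

    toward_insert is +1 if the insert lies at higher coordinates than the
    flanking sequence and -1 if it lies at lower coordinates.
    """
    if toward_insert not in (1, -1):
        raise ValueError("toward_insert must be +1 or -1")
    lo, hi = sorted((insertion_site, insertion_site + toward_insert * window))
    return [g.name for g in genes if g.start <= hi and g.end >= lo]


# Hypothetical gene models on one scaffold (coordinates are made up).
models = [
    GeneModel("geneA", 1_000, 4_000),
    GeneModel("geneB", 12_000, 15_000),
    GeneModel("geneC", 30_000, 33_000),
]
print(candidate_genes(20_000, -1, models))  # interval 10,000 to 20,000
```

Searching only on the insert side of the junction reflects the reasoning in the text: any deletion accompanying integration would remove sequence in that direction.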

Nevertheless, likely candidate genes could be identified
for several mutants (Table II). In the acetate-requiring
phenotypic class, putative mutants were
isolated in the Rubisco small subunit (RBCS) locus
(see below), the ATP synthase δ-subunit gene
(CAL007.02.03), and a gene involved in the phosphoenolpyruvate-dependent
sugar phosphotransferase
system (CAL007.03.41). In the pigment-deficient mutant
class, CAL007.01.09, which is yellow in the dark,
was shown to have an insertion in a gene exhibiting
homology to the Synechocystis sp. PCC 6803 carotene
isomerase or crtH gene. Among mutants that are
sensitive to RB and/or MZ, candidate genes include
2 heat shock protein genes, HSP101 (CAL007.01.42)
and HSP70A (CAL007.03.08), a putative sigma-class
glutathione S-transferase gene (CAL007.02.27), a
putative digalactosyldiacylglycerol synthase gene
(CAL007.02.10), and a putative Cu(II)-type ascorbate-dependent
monooxygenase gene (CAL007.02.19). Mutant
CAL007.01.17 has an insertion downstream from
a gene showing homology to the uroporphyrin III-synthase
gene (hemD) from Synechocystis. This mutant
is sensitive to MZ and bleaches on minimal
medium in HL, suggesting that siroheme and/or
vitamin B12 may be involved in the response to
superoxide and photooxidative stress.

Interestingly, several mutants were found to have insertions
at or close to the RBCS locus, which contains the
RBCS1 and RBCS2 genes. For pSP124S (111 sequences
in total), this was found in 3 independent mutants
(CAL005.01.13, CAL005.01.26, and CAL007.03.43). The
pSP124S plasmid contains promoter, 3'-untranslated
region, and intron sequences from RBCS1 (Fig. 2). The
lines CAL005.01.13 and CAL005.01.26 both have a
light-sensitive, acetate-requiring phenotype consistent
with a deletion of both RBCS genes (Khrebtukova
and Spreitzer, 1996). Genetic analysis showed that
CAL005.01.13, reported previously as dim1 (Dent
et al., 2001), is tagged by the transforming DNA (Table
III). Isolation of the flanking sequence from both sides
of the insert by plasmid rescue showed that a deletion
of approximately 36 kb of genomic DNA has occurred
in CAL005.01.13, and this deletion affects the entire
RBCS locus (Dent et al., 2001). Subsequent work with
this mutant has shown that the phenotype can be rescued
by complementation with either the RBCS1 or
Plant Physiol. Vol. 137, 2005 



Table III. Genetic analysis of insertional mutants

                Recombinants/Total Progenyᵃ
Mutant ID       Progeny from        Progeny from          Linkage
                Complete Tetrads    Incomplete Tetrads

CAL005.01.13    0/16                0/25                  Yes
CAL005.01.15    11/16               14/23
CAL005.01.16    0/1[illegible]      -                     Yes
CAL005.01.21    0/20
CAL005.01.26    0/48                -                     Yes
CAL005.01.28    0/16                0/31                  Yes

CAL007.01.08    0/40                0/15                  Yes
CAL007.01.09    0/16                                      Yes
CAL007.01.13    2/8                 4/26
CAL007.01.20                        8/27
CAL007.01.24    9/36
CAL007.01.30    0/36
CAL007.01.39    18/40
CAL007.02.02    0/48
CAL007.02.05    14/44
CAL007.02.38    18/44               10/25

ᵃRecombinants include zeocin-sensitive progeny that have the
screened phenotype and zeocin-resistant progeny that lack the
screened phenotype. Dash indicates no progeny of that type.



RBCS2 genes (R.J. Spreitzer, personal communication).
The flanking sequence from CAL007.03.43 showed the
insertion to be immediately downstream of the RBCS1
gene, suggesting that the RBCS locus is intact in this
mutant. Consistent with this analysis, the mutant does
not have an acetate-requiring or light-sensitive phenotype,
although it is MZ sensitive (Table II).



Genetic Analysis 

To analyze the frequency with which the mutation is
linked to the transforming DNA in the population of
screened mutants, several mutants were crossed to an
mt- wild-type strain. The progeny were then analyzed
for cosegregation of the zeocin-resistance phenotype
with the phenotype characterized during the
screening procedure. Table III shows the linkage
results of 17 crosses. A total of nine mutants (52%)
showed no recombinant progeny, demonstrating linkage
of the screened phenotype to the transforming
DNA. With the number of progeny analyzed and
assuming an average of 100 kb/cM in Chlamydomonas
(Kathir et al., 2003), the transforming DNA would
be inserted within 50 to 100 kb of the gene or genes
resulting in the screened phenotype. More progeny
would need to be analyzed to state with certainty that
the mutation is indeed tagged.
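
To make the relationship between progeny number and mapping resolution concrete, here is a rough Python sketch. It is not the authors' calculation: it simply assumes that 1 cM corresponds to 1% recombinant progeny, uses the 100 kb/cM average quoted above, treats each progeny as an independent meiotic product, and ignores sampling error. The function name `resolution_kb` is hypothetical.

```python
def resolution_kb(n_progeny: int, kb_per_cm: float = 100.0) -> float:
    """Physical distance (kb) corresponding to a single recombinant among
    n_progeny, i.e. the smallest nonzero recombination frequency that the
    cross could have revealed."""
    if n_progeny <= 0:
        raise ValueError("n_progeny must be positive")
    centimorgans = 100.0 / n_progeny  # 1 cM taken as 1% recombinant progeny
    return centimorgans * kb_per_cm


for n in (50, 100, 200):
    print(n, "progeny ->", resolution_kb(n), "kb")
```

Under these simplifying assumptions, doubling the number of progeny halves the interval within which the insertion can be placed, which is why the text notes that more progeny are needed before a mutation can be called tagged with certainty.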



DISCUSSION 

The last 10 years have heralded the sequencing era
in biology. As more and more genome sequences
become available, one of the most significant findings
being revealed is the large number of genes for which
no function is known or can be predicted by sequence
similarity alone. Inactivation of a gene is generally the
most direct way to understand its function. An essential
tool for the functional analysis of sequenced
genomes is therefore the ability to create loss-of-function
mutations for all of the genes (Alonso et al.,
2003). Thus far, this has only been achieved for the
unicellular budding yeast Saccharomyces cerevisiae
(Giaever et al., 2002), utilizing targeted gene replacement
via homologous recombination. Unfortunately,
this tool is not available in many eukaryotic organisms.
Gene silencing has recently been employed to
study the role of approximately 86% of the predicted
genes in the Caenorhabditis elegans genome (Kamath
et al., 2003). However, RNA interference-based methods
of gene inactivation have several drawbacks,
including the lack of stable heritability of a phenotype
and variable levels of residual gene activity. For
organisms in which homologous recombination is
not available, therefore, libraries of sequence-indexed
insertional mutants have many advantages (Parinov
and Sundaresan, 2000). Although insertional mutagenesis
has been used successfully in the generation of
mutant libraries in animals (Kaiser and Goodwin,
1990; Zwaal et al., 1993; Golling et al., 2002), their
strength has been most convincingly demonstrated in
plants (Alonso et al., 2003). Large mutant collections
exist for both T-DNA and transposon insertional lines
in Arabidopsis (Sundaresan et al., 1995; Tissier et al.,
1999; Sessions et al., 2002; Alonso et al., 2003), maize
(May et al., 2003), and rice (Kim et al., 2004; Kolesnik
et al., 2004; Sallaud et al., 2004). These banks are
invaluable resources for establishing gene function in
higher plants (Østergaard and Yanofsky, 2004).

To develop a resource for functional genomics of 
photosynthesis in Chlamydomonas, we have initiated 
a project to generate, screen, and obtain the flank- 
ing sequence from insertional mutants that exhibit 
photosynthesis-related phenotypes. This article details 
the phenotypic, molecular, and genetic characteristics 
of a subset of these mutants. Phenotypic analysis of the 
mutants confirmed that initial selection of transform- 
ants and subsequent maintenance of the mutants in 
the dark allows for the recovery of a large class of 
light-sensitive mutants (Table I), which might other- 
wise have been overlooked if nonphotosynthetic 
mutants were isolated by screening of light-grown 
cultures on minimal media (Spreitzer and Mets, 1981). 
Maintenance in the dark, however, may lead to the 
accumulation of light-sensitive, spontaneous muta- 
tions over time. This was found in the case of the 
CAL007.01.09 mutant (with an insertion in the caro- 
tene isomerase gene), which acquired an additional 
light-sensitive mutation that was revealed during 
genetic analysis. To minimize this problem, mutants 
are stored in liquid nitrogen or as a mated zygospore 
stock as soon as possible after isolation. RB and MZ 
were shown to be useful for the isolation of mutants
that are sensitive to generators of reactive oxygen
species. The choice of these two compounds was also 
found to be effective in differentiating the response to 
specific reactive oxygen species, as only one-third of 
the total number of RB- or MZ-sensitive mutants were 
found to be sensitive to both chemicals. 

Molecular analysis of the mutant population showed
that only approximately 30% of the mutants contained
insertions of the ble gene at more than 1 locus, with an
average number of insertions per clone of 1.4. This is
comparable to Arabidopsis T-DNA mutant collections,
in which the average number of T-DNA insertions per
line is reported to be approximately 1.5 (McElver et al.,
2001; Sessions et al., 2002; Alonso et al., 2003). Although
a higher number of insertions per mutant means that
fewer mutants are required to saturate the genome,
isolation of the flanking sequence becomes more difficult
when PCR-based techniques are used. In addition,
the presence of numerous insertions per clone often has
a negative impact on the mating ability of a clone and
necessitates backcrossing to isolate the relevant mutation.
It is therefore advantageous to maximize the
number of clones with single inserts for both molecular
and genetic reasons.

Genetic analysis showed that, in approximately 50%
of the insertional mutants, the phenotype cosegregated
with the transforming ble gene (Table III). This is
in agreement with other insertional mutagenesis studies
in Chlamydomonas (Niyogi et al., 1997; Fleischmann
et al., 1999; Moseley et al., 2000). The tagging frequency
in Chlamydomonas insertional mutagenesis
therefore compares well with that reported for Arabidopsis
T-DNA transformation, where as few as 35% of
the mutants in a population may be tagged (McElver
et al., 2001). It should also be noted that mutants in



Table IV. TAIL-PCR cycling parameters used to isolate flanking DNA from insertional mutants

Reaction    Step                                                        No. of Cycles

Primary     1     95°C, 2 min                                           1
            2     94°C, 1 min; 62°C, 1 min; 72°C, 2.5 min               5
            3     94°C, 1 min; 25°C, 3 min; [illegible] 3 min;
                  72°C, 2.5 min
            4     94°C, 30 s; 68°C, 1 min; 72°C, 2.5 min;               15
                  94°C, 30 s; 6[illegible]°C, 1 min; 72°C, 2.5 min;
                  94°C, 30 s; 44°C, 1 min; 72°C, 2.5 min
            5     72°C, 5 min                                           1
Secondary   1     94°C, 30 s; 64°C, 1 min; 72°C, 2.5 min;               12
                  94°C, 30 s; 64°C, 1 min; 72°C, 2.5 min;
                  94°C, 30 s; 44°C, 1 min; 72°C, 2.5 min
            2     72°C, 5 min                                           1
Tertiary    1     94°C, 30 s; 44°C, 1 min; 72°C, 2.5 min                20
                  72°C, 5 min

which the screened photosynthesis-related phenotype
is not tagged may be of interest in other fields
of Chlamydomonas biology. For example, mutant 
CAL007.03.22 was found to contain an insertion adja- 
cent to the gene encoding the 11-kD dynein light chain 
of the flagellar outer arm. It is unlikely that this would 
lead to the observed MZ- and RB-sensitive phenotype, 
but the mutant may have a linked motility phenotype 
that would not have been detected in our screening 
procedure. 

Molecular analysis of the mutant population also 
revealed that, although all mutants analyzed had at 
least 1 copy of the ble gene, only approximately 50% of 
mutants had a band hybridizing to the origin of 
replication from the pBluescript region of the transforming
plasmid. PCR screening also indicated that
even fewer clones contained a fully intact origin of 
replication and ampicillin resistance gene (data not 
shown), suggesting that deletions affecting the trans- 
forming DNA occur frequently upon insertion into the 
Chlamydomonas genome. This illustrates why plasmid
rescue has been a difficult technique to use in
forward genetics studies in Chlamydomonas, as sequences
required for the maintenance of the plasmid
in Escherichia coli are frequently lost. In addition to the 
fact that plasmid rescue is not easily modified to 
higher throughput approaches, the above problem 
also explains why TAIL-PCR is the method that we 
have chosen for the isolation of the flanking sequence. 
Although PCR-based techniques are often difficult to 
optimize in Chlamydomonas due to the GC-rich 
nature and high occurrence of repeat regions in the 
genome, this article reports that TAIL-PCR was suc- 
cessful in amplifying fragments in almost 80% of the 
mutants analyzed. The only drawback of TAIL-PCR is 
that it cannot amplify through tandem arrays of 
inserts, and these occurred in approximately 15% to 
20% of insertional mutants. This, however, compares 
favorably with Arabidopsis T-DNA mutant collec- 
tions, in which 25% of left-border products and 62% 
of right-border products have been found to contain 
only T-DNA sequence (Sessions et al., 2002) using
TAIL-PCR. Thus, the advantages of TAIL-PCR for 
higher throughput strategies outweigh its drawbacks. 

Since the long-term aim of this project is to saturate 
the Chlamydomonas genome with mutations affecting 
photosynthesis, several other criteria in addition to 
insert number need to be examined. The number of 
insertional mutants required to saturate the genome is 
also dependent on the size of deletions that may occur 
at the site of insertion; larger deletions have the 
potential to affect multiple genes. Deletions of genomic 
DNA occurring at the point of insertion in Chlamydomonas
range in size, but can be as large as 50 kb (Tanaka
et al., 1998). The population described here appears to
follow the same pattern. The mutant CAL007.01.15, for
example, has a deletion of 36 kb (Dent et al., 2001),
whereas CAL005.01.20 has only a few base pairs deleted
at the site of insertion (data not shown). Calculations
of the number of mutants needed also assume
that insertion is a random event (Clarke and Carbon,
1976). Whether insertional mutagenesis is truly random
in Chlamydomonas has also not been examined in
previous studies. T-DNA insertion in Arabidopsis has
been found to show bias against both predicted coding
sequences and centromeres and to occur in preferred
sites of integration or hot spots (Barakat et al., 2000;
Sessions et al., 2002; Alonso et al., 2003). This work
reports that, of the 50 flanking sequences isolated, 3
were found to be clustered within 50 kb of the RBCS2
locus, and 2 were found associated with histone clusters
(Table II). It is therefore possible that there is some site
bias during insertional mutagenesis in Chlamydomonas,
and this may be related to either sequence composition
of the transforming DNA or variation in
recombination frequency across the genome. It might
be possible to minimize the impact of site bias in the
mutant collection by using a variety of plasmids and
selectable marker genes for insertional mutagenesis
(Randolph-Anderson et al., 1998; Kovar et al., 2002;
Depège et al., 2003). The issues of average deletion size
and insertion site bias will need to be resolved once 
more mutants have been generated and characterized, 
thus allowing for a more accurate estimation of the 
number of mutant lines that need to be generated to 
achieve saturation. 
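
The random-insertion assumption mentioned above leads to the commonly used Clarke and Carbon estimate, N = ln(1 − P) / ln(1 − f), where P is the desired probability of hitting a target at least once and f is the fraction of the genome that the target occupies. The sketch below is a minimal illustration of that formula only; the target size and genome size are placeholder values rather than figures from the article, and the function name is hypothetical.

```python
import math


def lines_needed(prob: float, target_bp: float, genome_bp: float) -> int:
    """Number of random, independent insertions needed so that a target of
    target_bp is hit at least once with probability prob."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must be between 0 and 1")
    f = target_bp / genome_bp  # fraction of the genome occupied by the target
    if not 0.0 < f < 1.0:
        raise ValueError("target must be smaller than the genome")
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - f))


# Illustrative values only: a 10-kb target in a 100-Mb genome.
print(lines_needed(0.95, 10_000, 100_000_000))
```

Because larger deletions enlarge the effective target and site bias violates the independence assumption, the estimate shifts in either direction once those factors are measured, which is the point made in the paragraph above.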

Over the next several years, we aim to generate and 
screen 80,000 insertional mutant lines in Chlamy- 
domonas. This will lead to the isolation of approxi- 
mately 7,000 mutants affected in photosynthesis and 
sensitivity to photooxidative stress. Flanking sequen- 
ces will be available as a searchable database within 
the Chlamydomonas Genome Project Web site (http://www.chlamy.org)
and, when the final genome sequence
is released, these sequences will be marked
on the genome as an optional track within the browse 
function. Researchers can therefore either search the 
database with DNA sequences of interest or scan the 
genomic sequence surrounding their gene of interest 
for flanking sequence tags from mutants. The mutants 
will be available to the scientific community as mated 
zygospore stocks from the Chlamydomonas Genetics 
Center. Progeny recovered from heterozygous zygo- 
spores will represent a segregating population, which 
will allow for immediate genetic analysis of linkage 
between the mutant phenotype and the selectable 
marker used for transformation. Strains will also be 
stored frozen in liquid nitrogen to minimize the loss of 
mutants that are unable to mate. This population of 
mutants will represent the first publicly available 
catalogued collection of insertional mutants in Chlamydomonas,
and it will be an invaluable resource for
photosynthesis research.



MATERIALS AND METHODS 

Media and Strains 

Cultures of Chlamydomonas reinhardtii cells were grown heterotrophically
or photoheterotrophically in Tris-acetate phosphate media (TAP) or
photoautotrophically in minimal high-salt (HS) media (Harris, 1989). Strain
and mutant stocks were maintained on TAP agar medium in the dark at 25°C.
For procedures that required liquid cultures, cells were grown in 50 mL TAP
medium with shaking at 120 rpm either in the dark or at a very low light
(VLL) intensity of 3 μmol photons m⁻² s⁻¹ at 25°C, except where otherwise

The Chlamydomonas strain used to generate the population of mutants
was selected for its ability to grow well and remain green in the dark on TAP
medium. The growth of the standard laboratory strains CC125 (mt+) and
CC124 (mt-), obtained from the Chlamydomonas Genetics Center (Duke
University, Durham, NC), was compared with that of strains 4A+ and 17D-,
which were obtained from J.-D. Rochaix (University of Geneva). Like CC125
and CC124, 4A+ (mt+) and 17D- (mt-) are in the 137c wild-type strain
background.

For genetic analysis of the mutants generated in 4A+, a near-isogenic mt-



m step. For DNA w: 



in (4A-), 



as 4A + , 



0 HL ai 



species gencratoi 

The strains used for the preparation of gamete autolysin were CC620 (137c
NM subclone, mt+) and CC621 (137c NO subclone, mt-). These were also
obtained from the Chlamydomonas Genetics Center.



Generation of Mutants and Genetic Crosses 

Insertional mutagenesis of Chlamydomonas cells followed the transformation
method of Kindle et al. (1989). One of 2 plasmids was used for
transformation, pSP124S (Lumbreras et al., 1998) or pMS188 (Schroda et al.,
2002), linearized with BamHI or KpnI, respectively (Fig. 1). Transformations
with pSP124S used 1 μg plasmid DNA/transformation, whereas 0.6 μg of
pMS188 were used. After transformation, the cells were allowed to recover in
10 mL TAP overnight in the dark at 25°C, with shaking at 110 rpm. The cells

TAP, and plated onto TAP agar plates containing 5 μg mL⁻¹ zeocin (Invitrogen,
Carlsbad, CA). The plates were maintained in the dark at 25°C for 3

Genetic crosses and tetrad analysis to assess linkage of the observed
phenotype with antibiotic resistance were performed according to established
methods (Harris, 1989).



Screening 

Stock plates of insertional mutants were maintained in the dark on TAP
agar plates. Prior to screening, the mutants were subcultured to fresh TAP
plates and maintained at VLL at 25°C for 3 weeks. These VLL-acclimated
mutants were used to inoculate 150 μL TAP in 96-well plates by replica
plating. After 5 to 7 d of growth (VLL, 25°C), 3 μL cells were spotted onto each
of the following primary screen plates: (1) TAP agar; (2) HS agar; (3) 1.5 mM
MZ (Sigma, St. Louis) in TAP agar; and (4) 2 μM RB (Sigma) in TAP agar (Fig.
2). The TAP plates were maintained in the dark, the MZ and RB plates were
incubated at a LL intensity of 80 μmol photons m⁻² s⁻¹, and the HS plates
were incubated at a HL intensity of 500 μmol photons m⁻² s⁻¹. The temperature
for all treatments was 25°C. All plates were scored for cell growth
and bleaching after 7 to 10 d of treatment. Mutants displaying reduced growth
or bleaching under any condition were picked for secondary screening.

For secondary screening, cells were grown and inoculated as described for

(dark); (2) TAP agar (VLL); (3) TAP agar (LL); (4) HS agar (VLL); (5) HS agar
(LL); (6) HS agar (HL); (7) HS agar (LL, with high CO2); (8) HS agar (HL, with
high CO2); (9) 2 μM RB in TAP agar (VLL); (10) 2 μM RB in TAP agar (LL); (11)
1 μM RB in TAP agar (LL); (12) 1.5 mM MZ in TAP agar (VLL); (13) 1.5 mM MZ
in TAP agar (LL); and (14) 1.5 mM MZ in HS agar (LL). High CO2 atmosphere
was achieved by incubating the plates in BBL GasPak CO2 pouches (Becton-Dickinson,
Franklin Lakes, NJ). In addition to these 14 treatments, the
maximum chlorophyll fluorescence of the dark-grown TAP stock cultures
was measured using video imaging (Polle et al., 2002).



DNA Extraction Technique 



et al. (1992), excluding the final Cs
TAIL-PCR, cells were collected by ce
medium. The pellet was washed with 200 Milli-Q water, and DN
extracted using DNAzol reagent (Invitrogen) according to the manufac
instructions. The final DNA pellet was resuspended in 100 μL Tris-
(10 mM Tris, pH 8.0, 0.1 mM EDTA).



TAIL-PCR and Sequencing of Amplified Fragments 

Genomic DNA adjacent to the insertion site of the transforming DNA was
amplified using TAIL-PCR (Liu et al., 1995). The method employed was
Chlamydomonas. Flanking DNA was only isolated from the
of the insertion adjacent to the ble gene in each plasmid used, as it was
found that random deletions of pBluescript sequences from the other end
of the transforming DNA made amplification difficult. For pSP124S, the specific
primers for primary, secondary, and tertiary reactions were RMD223
(5'-TTGGCTGCGCrCCTTCTGCCATn-AAATC-3'), RMD224 (5'-GCAIT-
TAAATCTCGAGGTCCAC-3'), and RMD225 (5'-GAIAAGCrrGATATC-
GAA'lTCC-3'), respectively. For pMS188, the specific primers for primary,
secondary, and tertiary reactions were RMD264 (5'-GTCCTGAACCGG-
TAGCTTAGCrCC-3'), RMD255 (5'-CTCCCCGTlTCGTGCTG AfCAGTC-
3'), and RMD256 (5'-GAGGAGnTlX;CAArrTTGrrCG-3'), respectively.
Two arbitrary degenerate primers (Wu-Scharf et al., 2000) were tested for
amplification, RMD227 (5'-NTCCWGWTSCNAGC-3') and RMD228
(5'-WGNTCWGNCANGCG-3'). RMD227 was found to amplify flanking
regions successfully in most samples, whereas RMD228 only resulted in
fragments in 50% of samples tested. RMD227 was therefore selected as the

Primary TAIL-PCR reactions (20 μL) contained 1× PCR buffer (500 mM
KCl, 100 mM Tris-HCl, pH 8.3, 15 mM MgCl2), 200 μM of each dNTP, 5 pmol
RMD223 or RMD264, depending on the plasmid used for transformation,
60 pmol RMD227, and 2.5 units Taq polymerase (Eppendorf AG, Hamburg,
Germany). The cycling parameters for all reactions of TAIL-PCR are described
in Table IV.
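
The arbitrary degenerate primers above are written with IUPAC ambiguity codes (N = any base, W = A or T, S = C or G). As a minimal sketch, the following Python snippet counts how many distinct oligonucleotides each degenerate primer represents. The primer strings are copied from this text as extracted and may carry extraction errors; the `degeneracy` helper is hypothetical and not part of the published method.

```python
from math import prod

# Number of bases each IUPAC nucleotide code can stand for.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(IUPAC_DEGENERACY[base] for base in primer.upper())


# Sequences as they appear in the text above.
primers = {"RMD227": "NTCCWGWTSCNAGC", "RMD228": "WGNTCWGNCANGCG"}
for name, seq in primers.items():
    print(name, degeneracy(seq))
```

A more degenerate primer anneals at more genomic sites but has a lower effective concentration of any single sequence, which is one reason TAIL-PCR protocols typically use the degenerate primer in large excess over the specific primer (60 pmol versus 5 pmol in the reaction described above).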



specific primer was replaced with RMD224 or RMD255. For the low- 
stringency tertiary reaction, the secondary reaction was again diluted 25- 
fold and either 1- or 1-nL aliquots, depending on the level of amplification 

the specific primers RMD225 or RMD256. The amplified products from both 
the primary and secondary reactions were analyzed by agarose gel electro- 
phoresis. Reactions were purified as follows prior to sequencing. For samples 
where a single band was amplified, DNA from the tertiary PCR reaction mix
was isolated using the QIAquick PCR purification kit (Qiagen, Valencia, CA).
If more than one band was amplified, the fragments were separated by
agarose gel 
gel using ll 

For direct sequencing, 10 to 60 ng DNA were amplified with 10 pmol/
reaction of RMD225 (pSP124S) or RMD256 (pMS188) using the DYEnamic ET
Terminator Cycle Sequencing kit (Amersham Biosciences, Piscataway, NJ)
according to the manufacturer's instructions, including the optional dilution
buffer at a 1:1 (v/v) dilution. Sequencing reactions were run on an ABI3100
sequencer. Sequence data are available at the Chlamydomonas Genome
Project Web site (http://www.chlamy.org).



Zygospore Storage and Cryopreservation of Cells

All mutants that showed a phenotype after the secondary round of

storage, each mutant was mated to the strain 4A-, and the mating mix was
added to clay particles (unscented cat litter) and allowed to dry as described
(Harris, 1989). Cryopreservation of cultures was carried out as described
previously (Crutchfield et al., 1999).

Upon request, all novel materials described in this publication will be
made available in a timely manner for noncommercial research purposes.

Simple, experimentally tractable systems such as Saccharomyces
cerevisiae, Chlamydomonas reinhardtii, and Arabidopsis thaliana are
powerful models for dissecting basic biological processes. The unicellular
green alga C. reinhardtii is amenable to a diversity of genetic and
molecular manipulations. This haploid organism grows rapidly in axenic
cultures, on both solid and liquid medium, with a sexual cycle that can be
precisely controlled. Vegetative diploids are readily selected through the
use of complementing auxotrophic markers and are useful for analyses of
deleterious recessive alleles. These genetic features have permitted the
generation and characterization of a wealth of mutants with lesions in
structural, metabolic, and regulatory genes. Another important feature of
C. reinhardtii is that it has the capacity to grow with light as a sole
energy source (photoautotrophic growth) or on acetate in the dark
(heterotrophically), facilitating detailed examination of genes and
proteins critical for photosynthetic or respiratory function. Other
important topics being studied using C. reinhardtii, many of which have
direct application to elucidation of protein function in animal cells (26),
include flagellum structure and assembly, cell wall biogenesis,
gametogenesis, mating, phototaxis, and adaptive responses to light and
nutrient environments (32, 4'!). Some of these studies are directly relevant
to applied problems in biology, including the production of clean,
solar-generated energy in the form of H2, and bioremediation of heavy
metal wastes.

Recent years have seen the development of a molecular toolkit for
C. reinhardtii (42, 44, 06, 98, 99). Selectable markers are available for
nuclear and chloroplast transformation (4, 5, 12, 13, 30, 44, 56, 82). The
Arg7 (22) and Nit1 (30) genes are routinely used to rescue recessive mutant
phenotypes. The bacterial ble gene (which codes for zeocin resistance
[70, 112]) is an easily scored marker for nuclear transformation, and the
bacterial aadA gene (which codes for spectinomycin and streptomycin
resistance) is a reliable marker for chloroplast transformation (39).
Nuclear transformation can be achieved by particle bombardment (22, 23, 57,
73), agitation with glass beads (56, 81), or electroporation (105, 121).
Generation of tagged insertional mutations by nuclear transformation has
led to the rapid identification of mutant alleles (3, 17, 20, 21, 60, 108,
109, 120, 132, 138). Plasmid, cosmid (92, 139), and bacterial artificial
chromosome (BAC) (66) libraries are used to rescue nuclear mutations.
Expression of specific genes can be repressed using both antisense (65,
103) and RNA interference technologies (50, 58, 107; N. F. Wilson and
P. A. Lefebvre, abstract presented at the 10th International Chlamydomonas
Conference, 2002). In addition, endogenous transposable elements (31, 102,
127), marker rescue of Escherichia coli mutants (89, 136), direct rescue of
C. reinhardtii mutants (38, 94, 132), and map-based techniques are being
used to clone specific genes. Chloroplast transformation (12, 83) has
permitted disruption (118) and site-specific mutagenesis of genes on the
chloroplast genome (33, 34, 35, 43, 45, 46, 63, 64, 76, 129, 131, 134, 140).
Reporter genes such as green fluorescent protein (36, 37), Ars
(arylsulfatase) (19), and Luc (luciferase) (77; M. Fuhrmann, L. Ferbitz,
A. Eichler-Stahlberg, A. Hausherr, and P. Hegemann, abstract presented at
the 10th International Chlamydomonas Conference, 2002) are helping to
elucidate processes such as transcriptional regulation (16, 49, 87, 93,
125) and polyadenylation-mediated chloroplast RNA decay (59).

Ongoing genome projects offer the scientific community a wealth of
information concerning the sequence and organization of the C. reinhardtii
genome. Combined with the molecular toolkit, these data expand our ability
to analyze gene function, organization, and evolution and to examine how
environmental parameters and specific mutations alter global gene
expression.

Generation of C. reinhardtii expressed sequence tag (EST) information was
initiated in Japan (www.kazusa.or.jp/en/plant/chlamy/EST), and augmented by
a National Science Foundation-supported project
(www.biology.duke.edu/chlamy_genome/) that has generated over 200,000
additional sequences assembled into over 10,000 "unique" cDNAs (106;
unpublished data). Microarrays with representation for all of the plastid
genes and approximately 3,000 nuclear genes (48, 68) have been used to
probe global gene expression in wild-type (48, 68) and mutant strains
(Z. Zhang and A. R. Grossman, unpublished results). Furthermore, the
genomic information has aided in the generation of tools for map-based
cloning, based on linkage of genetic and physical markers (55, 126).

The accumulation of cDNA sequence information and development of robust
molecular markers has stimulated the interests of the Joint Genome
Institute (JGI), Department of Energy, and under the leadership of one of
us (D. Rokhsar), a rough draft of the near-complete genome sequence was
made publicly accessible in the early part of 2003. This sequence has been
partially annotated and both cDNA information and molecular markers have
been anchored to the sequence. These advances have dramatically enhanced
the utility of C. reinhardtii as a model system.

NUCLEAR GENOME SEQUENCE 
Assembly and annotation of the genome. The nuclear genome of
C. reinhardtii is 100 to 110 million bp, comprising 17 genetic linkage
groups (55), with a very high GC content (nearly 65%) that results in
cloning difficulties and limits the length of reads from shotgun sequencing
reactions. Generating a high-quality genome sequence has therefore
presented unusual challenges. Sequencing strategies being used involve
production of random genomic fragments of ~3 and ~6 kbp, cloning of the
fragments into plasmids, and obtaining paired end sequences of the insert
DNA. Paired end sequences from 35 to 40 kbp fragments in fosmid vectors are
also being generated. This information is being integrated with end
sequence data from 15,000 BAC clones (see "Alignment of Genetic and
Physical Maps").

With a sequence redundancy of nearly 10-fold, the randomly sequenced
fragments generated by the strategies described above can be assembled into
"contigs" (contiguous stretches of reconstructed sequence obtained from
overlapping end sequences) that are further linked together into
"scaffolds" (longer stretches of reconstructed sequence interrupted by
"gaps" whose size is roughly known based on spanning clones). A preliminary
rough draft of the C. reinhardtii genome is already available at the JGI
Chlamydomonas Web site (see below). A high-quality draft genome assembly is
anticipated by the fall of 2004.
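
To give a feel for what a redundancy of nearly 10-fold implies, the sketch below applies the idealized Lander-Waterman model, in which reads land uniformly at random and the expected fraction of bases left unsequenced is exp(-c) for coverage c. The model is an assumption introduced here for illustration and is not described in the text; a genome with 65% GC content and cloning bias will deviate from it, so the figure is best read as a lower bound on the sequence left in gaps.

```python
import math


def uncovered_fraction(coverage: float) -> float:
    """Expected fraction of bases hit by no read under uniform random sampling."""
    return math.exp(-coverage)


def expected_uncovered_bases(genome_bp: int, coverage: float) -> float:
    """Expected number of unsequenced bases for a genome of the given size."""
    return genome_bp * uncovered_fraction(coverage)


# Genome size range quoted in the text: 100 to 110 million bp.
for genome_bp in (100_000_000, 110_000_000):
    missing = expected_uncovered_bases(genome_bp, coverage=10)
    print(f"{genome_bp:,} bp at 10x: about {missing:,.0f} bases uncovered")
```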

We plan to generate a complete sequence reconstruction of C. reinhardtii
chromosomes by linking together sequence scaffolds using genetic and
clone-based physical maps (see "Alignment of Genetic and Physical Maps").
Sequencing of selected regions of the genome is likely to be finished by
further targeted efforts to close gaps in scaffolds and by resequencing
low-quality regions to achieve a uniform error rate of less than one error
per 10,000 bases. The ultimate goal is a high-quality reference genome
sequence.
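
The target of less than one error per 10,000 bases can be restated on the Phred quality scale, where Q = -10 log10(p) for an error probability p. The sketch below is illustrative only; the function names are hypothetical, and the text does not say which quality metric the sequencing center used to measure the error rate.

```python
import math


def phred_quality(error_probability: float) -> float:
    """Convert a per-base error probability to a Phred-scaled quality value."""
    return -10.0 * math.log10(error_probability)


def error_probability(quality: float) -> float:
    """Convert a Phred-scaled quality value back to an error probability."""
    return 10.0 ** (-quality / 10.0)


target = 1 / 10_000  # one error per 10,000 bases, as stated in the text
print(f"Target error rate {target} corresponds to Q{phred_quality(target):.0f}")
```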
Annotation of the gene content of C. reinhardtii is being facilitated by
copious EST information produced by several projects (see below) and
availability of modern gene-finding methods that exploit expressed sequence
evidence, statistical signatures of coding regions, and conservation of
deduced polypeptide sequences with known proteins from other organisms. One
intriguing possibility for further analysis of the C. reinhardtii genomic
sequence is to compare it with sequence information from the colonial alga
Volvox carteri, with the goal of highlighting coding regions that may be
unique to the chlorophyte algae, and possibly to identify putative
conserved regulatory regions. While computational methods can only reliably
predict coding regions, the large scale EST collections will enable many 5'
and 3' untranslated regions (UTRs) to be directly determined. Furthermore,
probes synthesized based on ab initio gene predictions can be used to
identify and clone rare transcripts. Integration of complementary community
informatics resources centered on the genome will provide a comprehensive
view of the C. reinhardtii genome that is readily accessed by many
different network locations (see "Toward an Integrated Database").

Chlamydomonas Genome Portal. Genomic information generated at JGI can be
accessed through the JGI Chlamydomonas Genome Portal
(www.jgi.doe.gov/chlamy), which is intended as an archival, Web-based
source for C. reinhardtii genomic sequence information and associated
annotations (Fig. 1A). Prior to initial publication of the genome sequence
and its annotation and analysis, items presented on the JGI site should be
considered to be preliminary results and a community resource.

Various precalculated features identified on the genome (exons; genes;
mRNA, EST, or unigene alignments; markers for mapping; protein BLAST hits;
etc.) are organized in "tracks" using a graphical interface similar to that
developed at Santa Cruz for the human genome (S-f) (Fig. 1B). Clicking on
(selecting) a predicted gene will display a page (Fig. 1C) showing protein
and transcript sequences, precalculated BLAST results (1), and InterPro
(79) determinations of protein domains. Clicking on an EST, unigene, or
mRNA alignment displays a graphical view of the alignment as well as
information at the sequence level and BLAST results relative to known
proteins from various organisms.

Users can reach a genomic region of interest in a variety of ways. One can
perform BLAST analysis against the genome and view resulting alignments in
the context of all the other database features. For example, comparing an
Arabidopsis protein to the C. reinhardtii genome with BLAST would access
the region of the C. reinhardtii genome with a similar sequence,
immediately recovering the gene at that location. There are tracks for
predicted gene structures based on the GeneWise (9) and GreenGenie (Susan
Dutcher, personal communication) algorithms, as well as for alignments of
publicly available ESTs (106), molecular markers (55), array elements, and
known protein sequences from specific organisms.

FIG. 1. (A) Schematic of JGI genome portal. The diagram shows the internal
connections of the JGI Genome Portal. Information on BLAST results, EST
alignments, and gene models can be accessed through the Search page. From
the gene model information page, or protein page, InterPro domains and
Smith-Waterman alignments to protein databases are displayed with a
graphical interface. With the version 2.0 release GO and KEGG will be
available, as well as the ability to annotate gene models. The
Chlamydomonas Genome Portal is accessible at www.jgi.doe.gov/chlamy.
(B) Browse view. Screen shot of the browse view for several gene models
displayed on the genome. Displayed simultaneously are overlapping EST
alignments and Blastx results. (C) Protein page. The protein page displays
information about a gene model. InterPro results, Smith-Waterman
alignments, and the protein and transcript sequence for this model can be
retrieved from this page.

TABLE 1. cDNA libraries

Core            TAP light, TAP dark, HS + CO2, HS
Core            TAP light, TAP dark, HS + CO2, HS
Core            TAP light, TAP dark, HS + CO2, HS
Stress I        NO3 to NH4 (30 min, 1 and 4 h), NH4 to NO3 (30 min, 1 and
                4 h), TAP-N (30 min, 1 and 4 h), TAP-S (30 min, 1 and 4 h),
                TAP-P (4, 12, and 24 h)
Stress II       NH4 to NO3 (24 h), H2 production (0, 12, and 24 h),
                TAP + H2O2 (1, 12, and 24 h), TAP + sorbitol (1, 2, 6, and
                24 h), TAP + Cd (1, 2, 6, and 24 h)
Deflagellation  15, 30, and 60 min
S1D2
Gamete          2, 8, 10, 12, 15, and 17 h
Zygote          30 and 60 min
Stress III      TAP-Fe, TAP-Cu, TAP-O2, TAP high light, HS high light

21gr   Not normalized     874    768
21gr   Normalized         894    10,080
21gr   Subtracted (894)   1024   12,096
21gr   Normalized         963    12,000
21gr   Normalized         1031   10,752
21gr   Normalized         1030   12,480
S1D2   Normalized         925    124
21gr   Normalized         1112   -
21gr   Normalized         35i0

Since a BLAST analysis of the genome against all proteins in GenBank has
already been performed and will be periodically updated, one can text
search through the names of precomputed alignments. Other access points
include the GO (gene ontology) and KEGG (Kyoto Encyclopedia of Genes and
Genomes) links that organize genes into functional groupings.

The JGI Chlamydomonas Genome Portal is in a dynamic state of development.
Assignment of gene functions is a feature of any genome project that is
continually being informed by sequence similarities, experimental evidence,
phylogenetic data, and expression profiles. To capture the richest
annotation of the C. reinhardtii genome, the JGI portal includes interfaces
for community annotation, allowing experts around the world to add their
input, and incorporates links to publications, experiments, and descriptive
text. New features being integrated into the JGI Portal include tracks
showing spanning BACs and fosmids. Improved gene models will be merged ab
initio with EST/mRNA evidence, increasing the number of complete gene
predictions (including UTRs) and revealing alternatively spliced
transcripts. Sequence signals for transmembrane spanning regions, signal
peptides, and targeting sequences will also be computed and added to the
site. Linkages to and from JGI pages to other community resources, notably
ChlamyDB, are being developed, as described below under "Toward an
Integrated Database."

THE TRANSCRIPTOME

Efforts are currently under way to identify transcribed regions of the
genome and to analyze their expression patterns.

cDNA Information. After a pilot experiment by S. Purton, a collection of
37,940 5'-end ESTs was generated for C. reinhardtii by the Kazusa DNA
Research Institute in Japan (2). Normalized, size-selected libraries were
generated from cells grown under low- or high-CO2 conditions. A National
Science Foundation-supported cDNA project performed at the Carnegie
Institution of Washington and the Genome Technology Center at Stanford has
led to the generation of cDNA libraries constructed from RNA isolated from
cells exposed to a variety of different conditions (Table 1); these
libraries were normalized prior to sequencing individual clones. One
library is from the field isolate S1D2 (4i), which has numerous sequence
polymorphisms but is interfertile with the laboratory strain 21gr, and is
used for map-based cloning of mutant alleles (55). Nearly 200,000 clones
have been sequenced from their 3' and 5' ends (106), and full-length
sequences are being generated. Our assembly protocol is based on the
commonly used Phrap program, which takes into account sequence quality. The
assembly generates assemblies of contiguous ESTs (ACEs), which
theoretically represent unique genes (106; J. Shrager, C.-W. Chang,
J. Davies, E. H. Harris, C. Hauser, R. Tamse, R. Surzycki, M. Gurjal,
Z. Zhang, and A. R. Grossman, presented at the proceedings of the 12th
International Congress on Photosynthesis, 2001)
(www.biology.duke.edu/chlamy/PDF/Shrager2003.pdf). Sequences from the
~10,000 ACEs in the assembly designated 20021010 (dated 10 October 2002)
have been annotated on the basis of BlastX homology to potential homologs
in other organisms. We are currently preparing a final assembly of all of
the EST data, which will include those from S1D2 as well as from the Purton
and Kazusa projects. Knowing the distribution of ESTs among the cDNA
libraries and the conditions used for library generation, we can infer a
qualitative image of the expression pattern of specific genes. Accordingly,
we have identified several genes represented by multiple cDNAs in the
stress libraries (including arylsulfatase, phosphatases, and regulatory
proteins) that are not represented in the core library.
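
The inference described above, reading a qualitative expression pattern from how ESTs are distributed among libraries, can be sketched as a simple tally. Everything in the example is hypothetical: the record layout, the ACE identifiers, the function names, and the threshold of two ESTs are assumptions chosen for illustration, and the sketch is not the authors' pipeline.

```python
from collections import defaultdict


def library_counts(est_records):
    """Tally ESTs per ACE and per library from (ace_id, library) pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for ace_id, library in est_records:
        counts[ace_id][library] += 1
    return counts


def absent_from_core(est_records, core_libraries=("Core",), min_ests=2):
    """List ACEs with at least min_ests ESTs outside the core libraries
    and no ESTs at all in the core libraries."""
    hits = []
    for ace_id, per_library in library_counts(est_records).items():
        in_core = sum(n for lib, n in per_library.items() if lib in core_libraries)
        elsewhere = sum(n for lib, n in per_library.items() if lib not in core_libraries)
        if in_core == 0 and elsewhere >= min_ests:
            hits.append(ace_id)
    return sorted(hits)


# Hypothetical records; library names follow Table 1.
records = [
    ("ACE_0001", "Core"), ("ACE_0001", "Stress I"),
    ("ACE_0002", "Stress I"), ("ACE_0002", "Stress II"),
    ("ACE_0003", "Stress III"),
]
print(absent_from_core(records))
```

Because the libraries were normalized, EST counts of this kind indicate presence or absence far better than they indicate abundance, which is consistent with the text describing the result as qualitative.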

Microarray construction and application. The DNA microarray is currently
the most commonly used and widely applicable technique for the global
analysis of gene expression. We have completed and are distributing a first
generation cDNA array. A region of each cDNA 3' end was amplified using a
universal primer in the vector and a specific primer ~400 bp upstream of
the 3' end. PCR products were purified and printed onto GAPS II amino
silane-coated slides (Corning), with each slide carrying four replicate
spots of each cDNA fragment. For version 1.0, we chose clones with
high-quality sequence information from 2,761 distinct ACEs. As of January
2003, a slightly different version is being distributed (version 1.1), with
~300 additional genes amplified either from our EST libraries, or from
other sources; many were kindly provided by other laboratories. Within 2
years we plan to generate an array representing the entire C. reinhardtii
genome.

We and others have already performed experiments with these arrays.
Recently, we have identified genes activated by high-intensity light under
low-CO2 conditions (48); these genes encode photorespiratory proteins,
proteins that combat the accumulation of toxic oxygen radicals,
polypeptides that function in concentrating inorganic carbon and several
proteins of unknown function. Expression studies have also been performed
with wild-type and mutant cells transferred from nutrient-replete to
sulfur-deficient medium. For example, the Sac1 gene controls the
acclimation of cells to sulfur deprivation conditions and encodes a
regulatory protein (17, 18) that has some similarity to transporters with
12 membrane-spanning helices. Figure 2 shows a set of microarrays generated
for CC-425 and the sac1 mutant following imposition of sulfur deprivation.
A number of transcripts were found to increase dramatically during
starvation. Some encode proteins involved in sulfur metabolism (e.g., the
Ars gene [which encodes arylsulfatase] and the Ats1 gene [which encodes ATP
sulfurylase]) or other cellular processes (e.g., Ecp76, which encodes a
cell wall polypeptide specific to sulfur stress cells [116]), while the
functions of several others remain unknown (Fig. 2).
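
Since each array element is printed as four replicate spots, a change in transcript level is usually summarized per element rather than per spot. The sketch below shows one common way to do that, taking the median of the per-spot log2 ratios. The intensities are invented, the function names are hypothetical, and the text does not state which summary statistic or normalization the authors applied.

```python
import math
from statistics import median


def log2_ratio(treated: float, control: float) -> float:
    """log2 of the treated/control signal ratio for a single spot."""
    return math.log2(treated / control)


def element_log2_ratio(replicate_spots):
    """Median log2 ratio over the replicate spots of one array element.

    replicate_spots is a list of (treated_intensity, control_intensity) pairs.
    """
    return median(log2_ratio(t, c) for t, c in replicate_spots)


# Invented intensities for four replicate spots of a single element.
spots = [(800.0, 100.0), (1600.0, 200.0), (400.0, 100.0), (1600.0, 100.0)]
value = element_log2_ratio(spots)
print(f"median log2 ratio = {value:.1f}, fold change = {2 ** value:.0f}")
```

The median is used here because it is less sensitive than the mean to a single damaged or saturated spot.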

Similar studies are being conducted (in the Grossman laboratory) on
phosphorus and nitrogen limitation, as well as on the physiological effects
of different light qualities. Other microarray studies have been initiated
with Krishna Niyogi (high-light-activated genes), Donald Weeks
(CO2-activated genes), and Jean-David Rochaix (mutants in photosynthetic
function). We have also distributed several hundred arrays to researchers
working on C. reinhardtii, and it is expected that a large corpus of data
will be generated in the coming months that should begin to reveal global
and interacting regulatory features of the genome. A specific microarray
section is being introduced into the Chlamydomonas Genome Project Database
in which all relevant information regarding array elements (sequence,
position on the array, ACE and gene models and their annotation, etc.) will
be made available.

ALIGNMENT OF GENETIC AND PHYSICAL MAPS

An important component of the genome project has been the placement of
molecular markers onto the C. reinhardtii genetic map, with the aim of
facilitating map-based cloning of genes identified by mutations. Over the
last 50 years, more than 200 phenotypic markers (mostly mutations) have
been mapped onto the 17 C. reinhardtii linkage groups, and recently, more
than 270 molecular markers have been placed on the linkage map. Some of
these have been correlated with mutant data, allowing for the alignment of
the physical and genetic maps. The defined physical markers are either
restriction fragment length polymorphism- or PCR-based markers. The
positioning of these markers onto linkage groups provides, on average, a
map in which any given point on the C. reinhardtii genome is within 2 cM of
a mapped molecular marker (55, 126), corresponding to 150 to 200 kbp of
genomic sequence.

To facilitate the use of the molecular map for map-based cloning, a BAC
library of more than 15,000 clones has been generated and arrayed,
providing an eightfold coverage of the nuclear genome. (Individual BAC
clones or the entire library can be obtained from the Clemson University
Genomics Institute: www.genome.clemson.edu). JGI has sequenced both ends of
all clones in this library, and this information is available and can be
searched using BLAST on the JGI Web site
(bahama.jgi-psf.org/prod/bin/chlamy/home.chlamy.cgi). More than 2,500 of
these clones, focusing on those containing mapped molecular markers, have
been fingerprinted and placed into overlapping BAC contigs. The BAC contigs
now cover more than 25% of the genome. As the assembly of the nuclear
genome proceeds, by linking together sequence scaffolds, it will be
increasingly useful to compare BAC end sequences with the genomic sequence
to place additional BACs onto the physical/genetic map. Ultimately, a
tiling path of BAC clones corresponding to the complete C. reinhardtii
genetic and physical maps will be generated.
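
The library statistics quoted above can be cross-checked with simple arithmetic: fold coverage equals the number of clones times the mean insert size, divided by the genome size. The sketch below rearranges that relation to estimate the mean insert size that the quoted figures imply. The function name is hypothetical, and because the text says "more than 15,000 clones", the result is an upper estimate and not a measured value.

```python
def implied_mean_insert_kbp(genome_mbp: float, n_clones: int, fold_coverage: float) -> float:
    """Mean insert size (kbp) implied by clone count, coverage, and genome size."""
    return fold_coverage * genome_mbp * 1000.0 / n_clones


# Genome size range (100 to 110 Mbp) and library figures as quoted in the text.
for genome_mbp in (100.0, 110.0):
    insert = implied_mean_insert_kbp(genome_mbp, n_clones=15_000, fold_coverage=8.0)
    print(f"{genome_mbp:.0f} Mbp genome: mean insert of about {insert:.1f} kbp")
```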

The information already available has made it possible to apply map-based
cloning strategies to the identification of mutant alleles in
C. reinhardtii, e.g., lf1 (R. Nguyen and P. Lefebvre, presented at the 10th
International Conference on the Cell and Molecular Biology of
Chlamydomonas, 2002) and bld2 (27). The Bld2 gene was cloned by identifying
overlapping BAC clones covering 720 kbp of genomic sequence corresponding
to 4.5 cM on linkage group III. The BAC clone containing the wild-type Bld2
gene was identified by transforming individual BAC clones into bld2 mutant
cells to rescue the mutant phenotype.

Map-based cloning will be greatly accelerated by a high density of
genetically mapped polymorphisms between the laboratory strain 21gr and
field isolate S1C5, which is very similar to S1D2. Sequence information
already available suggests that the frequency of polymorphisms between the
laboratory and wild-isolate strains is surprisingly high. In a survey of
more than 29,000 nucleotides from the 3' UTR of 62 transcripts, there were
2.7 base substitutions and 0.54 insertions or deletions per 100 bases. This
level of sequence polymorphism will allow any new mutation in a laboratory
strain to be mapped both genetically and physically. A protocol for mapping
any new mutation by crosses to S1C5 followed by PCR-based detection of a
set of molecular markers was recently described (55). Once a mutation has
been mapped to a genetic interval, more detailed fine-structure mapping may
require that additional molecular markers in the interval of interest be
identified. Such markers can be easily obtained from DNA sequence in
regions of interest by searching for microsatellite sequences [usually
(GT)n repeats]. Thousands of microsatellites, dispersed throughout the
genome, can be converted into PCR-based molecular markers by designing
specific oligonucleotide primers for PCR amplification of the
microsatellite-containing sequence, followed by identification of the
different alleles by sizing products on gels (the different alleles will
have different numbers of GT repeats). Kang and Fawley (52) have used this
procedure to map microsatellite sequences in C. reinhardtii.
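
The search for (GT)n microsatellites described above can be sketched with a regular expression. The function name, the minimum of six repeat units, and the example sequence are assumptions made for illustration. The sketch scans only the strand it is given, so a repeat on the opposite strand appears as (AC)n and would need a second pass on the reverse complement.

```python
import re


def find_gt_repeats(sequence: str, min_repeats: int = 6):
    """Return (start, repeat_count) for each run of at least min_repeats GT units.

    start is the 0-based position of the first base of the run.
    """
    pattern = re.compile(r"(?:GT){%d,}" % min_repeats)
    return [
        (match.start(), len(match.group()) // 2)
        for match in pattern.finditer(sequence.upper())
    ]


# Invented sequence: one run of 8 GT units and one run of 3 (below threshold).
example = "ACCA" + "GT" * 8 + "TTGCA" + "GT" * 3 + "C"
print(find_gt_repeats(example))
```

Primers placed in the unique sequence on either side of a reported run would then amplify a product whose length differs between alleles by multiples of 2 bp.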

ORGANELLE GENOMES 

A complete C. reinhardtii mitochondrial genome sequence is available
(GenBank accession U03843). This 15.7-kb genome encodes the cytochrome b
and cytochrome oxidase apoproteins, six NAD dehydrogenase subunits, a
protein resembling reverse transcriptase, large and small mitochondrial
rRNAs (fragmented), and three tRNAs (GenBank accession U03843). All other
mitochondrial components are presumably encoded in the nuclear genome.

Completion of the entire sequence of the chloroplast genome of
C. reinhardtii has permitted the generation of mutations in all of the
genes on that genome (except where the lesions are lethal) and an analysis
of transcripts that emanate from different genomic regions. The complete
sequence has also enabled the production of a chloroplast genome microarray
that can be used for analyzing the global accumulation of chloroplast
transcripts under different environmental conditions.

Chloroplast genes and their expression. The C. reinhardtii chloroplast
genome is 203.8 kbp (GenBank accession number BK000554) and contains 99
genes, including 5 rRNA genes, 17 ribosomal protein genes, 30 tRNAs
specifying all of the amino acids, and 5 genes encoding the catalytic core
of a eubacterial-type RNA polymerase (72). Figure 3 depicts the circular
genome, its known genes, and the positions of those that have been
disrupted. The genome contains a staggering number of small dispersed
repeats (SDRs) that mostly populate intergenic regions.

FIG. 3. Chloroplast genome. The C. reinhardtii chloroplast genome and its
genes are shown. Those that have been disrupted are highlighted. Legend
categories: ATP synthase; cytochrome b6/f; transcription/translation;
other; RNA; gene disrupted; gene disrupted (split gene); gene disrupted
(duplicated gene); gene disrupted (essential gene).

The structure and gene content of the C. reinhardtii chloroplast
chromosome are conventional, with a ribosomal DNA-containing inverted
repeat separating two single copy regions. When compared to the chloroplast
DNA (cpDNA) of land plants, the C. reinhardtii genome has a few noteworthy
features: (i) an unusual gene, tscA, that encodes an RNA that is involved
in trans-splicing of psaA transcriptional segments; (ii) a split rpoC1
gene; (iii) the presence of tufA, which encodes elongation factor EF-Tu;
(iv) two large open reading frames (ORFs) (1,995 and 2,971) of unknown but
essential function; and (v) an absence of ndh genes, which encode
polypeptides critical for chlororespiration, a process first reported in
C. reinhardtii (6). The ndh genes are ubiquitous on land plant cpDNA.

TABLE 2. Genes disrupted on the chloroplast genome of C. reinhardtii

Gene or ORF   Function(a)             Essential   Reference(s)
                                      No          93
                                      No          95
              PSI                     No          117
              PSI                     No          33
tscA          PSI                     No          40
ycf3          PSI                     No          10
ycf4                                  No          10
                                      No          7
psbC          PSII                    No          100
                                      No          29
              PSII                    No          78
psbH          PSII                    No          85, 113
psbI          PSII                    No          6}
psbK          PSII                    No          Its
psbT          PSII                    No          86
              PSII                    No          115
petA          Cytochrome b6/f         No          &2
pall)         Cytochrome b6/f         No          62
petG          Cytochrome b6/f
petL          Cytochrome b6/f         No          119
atpB          ATP synthase                        25
              ATP synthase
atpE          ATP synthase                        96
              Heme attachment         No          )33
fhlL          Envelope transporter    No          101
              Chlorophyll synthesis   No          IH
chlN          Chlorophyll synthesis   No          14
clpP          Protease                Yes         47, 71
ORF1995       Unknown                 Yes         11
rbcL          Rubisco                 No          no
rpoB1         Transcription           Yes         35
rpoB2         Transcription           Yes         35
rpoC2         Transcription           Yes         35
rps3          Translation             Yes         69

(a) PSI, photosystem I; PSII, photosystem II.

Gene disruption is routine for C. reinhardtii chloroplast genes, and even
the so-called essential genes can be functionally analyzed by weakening
their translation initiation codons (71). The completion of the genome
sequence does not offer many new gene candidates for functional analyses
but does provide landmarks necessary for gene manipulation and the analysis
of global plastid gene expression. Table 2 lists genes marked in Fig. 3 as
having been disrupted; the total is an impressive 35 genes in which only 6
could not be brought to homoplasmicity.

The analysis of the chloroplast genome enables researchers 
to define previously undiscovered genes and lo measure ex- 
pression of known genes. Sequence alone does not necessarily 
presage identification of a full genomic co)nplement, and some 
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FIG. 4. SDR sequences on the chloroplRst chromosome, lltc (irsi 
100 kb of the chloropiasi chromosome were nnolyzcd for SDRs using 

shown on liio'lop' row only; llic " Wendy'' Iraii.iposoi) anil its disiililtd 
duplicate copy are sliovvn on ihc top row iii aiounci position 75,000. 
The thin gray line represents one copy of llic large inverted repeal. 
Each dot represents a repent of the sequence along Ihc lop line; c.g„ 
Wendy IS duplicated, so a second line appears undornciilh ii, I he 
SDRs are represented by the laigc nuinhcrs of dots, whose sequence 
identity lo the particular place on the genome ranges from liO to 100% 
as shown in Ihe scale on the right. 



genes (like tscA) may not encode proteins. To complicate
matters, three of the four major photosynthetic complexes
(photosystem I, photosystem II, and the cytochrome b6/f complex)
contain small chloroplast-encoded polypeptides with ORF
sizes that would frequently arise by chance in the genome. For
this reason, annotation of the ORFs was limited to those at
least 100 residues long. Since small genes or non-protein-encoding
genes should nonetheless be represented in the transcript
pool, a comprehensive RNA filter blot analysis was undertaken,
using RNA isolated from cells grown under a range
of environmental conditions. As reported by Lilly et al. (68), the
accumulation of chloroplast transcripts is strongly affected by
culture conditions. Under conditions in which most investigators
grow their cells (in rich medium and under continuous
light), chloroplast transcript accumulation is relatively high.
This is consistent with the observations that substantial
decreases in the cpRNA content do not, in the short term, visibly
affect the synthesis of most chloroplast polypeptides (28). Under
conditions of abiotic stress, changes in transcript accumulation
range from subtle to as much as eightfold. Increases in
the levels of some transcripts in response to phosphate deprivation
appear to be mediated, at least in part, by polynucleotide
phosphorylase (Y. Komine and D. Stern, unpublished
results), a nuclear-encoded, chloroplast RNase whose activity
is modulated by physiologically relevant phosphate concentrations
(135).
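The 100-residue annotation cutoff mentioned above can be illustrated with a
short sketch. This is not the annotation procedure actually used for the
cpDNA; it is a minimal Python example that assumes ORFs run from an ATG to
the first in-frame stop codon, scans the forward strand only, and uses the
three standard stop codons. The function name `find_orfs` and its parameters
are hypothetical.

```python
# Illustrative sketch only; assumptions: ATG start, standard stop codons,
# forward strand only. Not the pipeline used to annotate the cpDNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_RESIDUES = 100  # cutoff stated in the text


def find_orfs(seq, min_residues=MIN_RESIDUES):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF starts at the first ATG seen in a frame and ends at the next
    in-frame stop codon. `end` is exclusive and includes the stop codon.
    Only ORFs encoding at least `min_residues` amino acids are kept.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_residues = (i - start) // 3  # stop codon not counted
                if n_residues >= min_residues:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

A filter of this kind discards short reading frames, which is why small
genes have to be detected by other means such as transcript analysis.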

SDRs. The SDRs that have colonized intergenic regions of
the cpDNA (Fig. 4) present a fascinating evolutionary puzzle.
Of sequenced cpDNAs within the chlorophytes, which include
land plants as well as green algae, only Chlorella sp. appears to
have numerous SDRs (72). Surprisingly, there is almost no
sequence similarity between the SDRs of Chlorella and C.
reinhardtii, suggesting that SDR amplification might share a
common mechanism but be sequence independent. The relatively
balanced distribution of SDRs in the C. reinhardtii chloroplast
genome raises questions concerning both their origin
and function. Did an ancient invasion of a transposable element
subsequently lead to the dispersal of smaller fragments,
or did a nuclear mutation somehow permit or foment accumulation
of SDRs? It has been suggested (15) that short repeats
may be associated with rearrangement of chloroplast genes or
that they might function as binding sites for proteins that
participate in gene expression. Interestingly, SDR-rich sequences
upstream of petA exhibit a conformational (torsional)
response to light, which is correlated with increased transcriptional
activity (122).

In summary, chloroplast genomics in C. reinhardtii has provided
sophisticated tools for analyzing and manipulating
cpDNA and has raised fascinating evolutionary questions. Recent
years have seen accelerated cloning and analysis of nuclear
genes encoding chloroplast regulatory factors (97, 99),
which will stimulate studies on their interactions with chloroplast
mRNAs and with one another (24, 137). With the sequencing
of the C. reinhardtii nuclear genome, whole new
families of putative regulators of chloroplast gene expression
will emerge, presenting an opportunity to build an integrated
image of genetic interactions between the nuclear and chloroplast
genomes and how they are fine-tuned by critical features
of the environment.

TOWARD AN INTEGRATED DATABASE 

Use of available databases. One strength of C. reinhardtii as
a model system lies in the extent to which it has been used for
genetic and physiological characterization of biological processes.
With the advent of C. reinhardtii genomics, we are
poised to link phenotypes, alleles, and expression and sequence
features into an integrated database.

The major goals of database construction are to (i) provide
user-friendly points of access for the sequence data, (ii) connect
genomic features to the classical biology of the organism,
(iii) provide tools for viewing and querying genomic and gene
expression data, and (iv) generate resources and tools for
cross-species comparisons as data from related algal species
become available.

Currently the genomic and organismal data are dispersed
among three databases: (i) ChlamyDB, which contains information
on genetic loci, mutant alleles, and sequenced genes,
descriptions of strains, bibliographical citations, and community
member information; (ii) ChlamyEST, which contains sequence
data (EST, contigs, unigene, chloroplast, mitochondria)
and gene annotations; and (iii) the JGI Chlamydomonas
Genome Portal (see "Chlamydomonas Genome Portal"
above), which contains the nuclear genome sequence, gene
model predictions, and preliminary annotation data. All three
databases are accessible through search engines, and both the
Chlamydomonas Genome Project and the JGI Web sites include
on-line Blast utilities, with additional specialized datasets
available at ChlamyEST containing sequences from the Volvocales
(including Chlamydomonas, Volvox, Eudorina, Pandorina,
Dunaliella, and Haematococcus, among others) and BAC end
sequences.

Integration of the databases. (i) Unification of ChlamyDB
and ChlamyEST. The near-term challenge is to link all C.
reinhardtii-related data sets in a seamless manner. To this end
we will unify data maintained in ChlamyDB and ChlamyEST
and establish links between this unified database and the JGI
Chlamydomonas Genome Portal. The Chlamydomonas Genome
Project is implementing a version of the Generic Model
Organism Database (111) with the aim of integrating genetic,
sequence, and bibliographic information. Figure 5 presents a
schematic of the proposed unifications. At the core of this
project is the underlying "chado" database schema, designed to
integrate the Drosophila melanogaster data in FlyBase into
distinct modular components with tightly defined dependencies
("Sequence," which contains biological sequences and annotation;
"Genetics," which houses alleles and relationships
between alleles and phenotypes; "Map," which contains any
type of localization excluding sequence localizations; "Expression,"
which depicts transcriptional events and protein expression;
"Companalysis," an adjunct to the sequence module for
in silico comparisons; "CV," which applies the controlled
vocabularies and ontologies; "Organism," which handles species
and taxonomy data; "Pub," which contains bibliographic,
publications, and reference data). As depicted in Fig. 5, data
currently in ChlamyDB (loci, alleles, strains, phenotypes, species,
bibliographic data, genetic data, and physical maps) will be
incorporated into the genetics, organism, publication, and map
modules. The sequence module will be populated by nuclear,
chloroplast, and mitochondrial genomic sequences, EST sequences
and their assembled contigs, complete cDNA sequences
obtained from our expression libraries or from information
in the literature, and DNA sequences that have been
used to build microarrays. In addition, the sequence module
maintains relationships that link sequence records to annotation
data derived from automated resources (GenBank,
SwissProt, InterPro, GO, and SO, etc.) and more accurate
manually curated annotation. In the future, the expression
module will accommodate global gene expression data derived
from the analysis of microarrays. Researchers requesting
microarrays from our facility will be asked to deposit a summary
of their results in this module, in addition to making their data
sets publicly accessible.
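The modular layout described above can be pictured as a simple lookup
table. The following Python sketch is an illustration only and is not the
chado schema itself. The module names and descriptions follow the text;
the pairing of each ChlamyDB data type with a target module is an assumed
reading, since the text lists the data types and the four target modules
without pairing them. The names `CHADO_MODULES`, `CHLAMYDB_ROUTING`, and
`modules_for` are hypothetical.

```python
# Chado modules as listed in the text (descriptions paraphrased).
CHADO_MODULES = {
    "sequence": "biological sequences and annotation",
    "genetics": "alleles and allele-phenotype relationships",
    "map": "localizations other than sequence localizations",
    "expression": "transcriptional events and protein expression",
    "companalysis": "in silico comparisons (adjunct to sequence)",
    "cv": "controlled vocabularies and ontologies",
    "organism": "species and taxonomy data",
    "pub": "bibliographic, publication, and reference data",
}

# ASSUMPTION: the text does not state which ChlamyDB data type goes to
# which module; this pairing is a plausible guess for illustration.
CHLAMYDB_ROUTING = {
    "loci": "genetics",
    "alleles": "genetics",
    "phenotypes": "genetics",
    "genetic data": "genetics",
    "strains": "organism",
    "species": "organism",
    "bibliographic data": "pub",
    "physical maps": "map",
}


def modules_for(data_types):
    """Return the sorted list of modules that the given data types map to."""
    return sorted({CHLAMYDB_ROUTING[d] for d in data_types})
```

Calling `modules_for(["alleles", "strains"])` under these assumptions
returns the genetics and organism modules, which mirrors the idea that each
kind of record has one well-defined home in the unified database.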

(ii) Interconnecting ChlamyDB and the JGI databases. To
provide a genome that has robust annotation and to avoid
unnecessary duplications, ChlamyDB and the JGI will establish
interdatabase links, enabling users who enter one database
to retrieve data maintained by the other (Fig. 5). For example,
a query of the new ChlamyDB for a particular gene or gene
product will return as complete a response to the query as
is available, together with information from the JGI data set.

DOWN THE ROAD 

Several important trends are emerging in C. reinhardtii research.
Analysis of mutant phenotypes (forward genetics) will
undoubtedly remain a central route for defining gene function.
The availability of genomic sequence information will spur the
development of insertional mutagenesis, and sequences of
DNA flanking insertion sites will immediately identify putative
genes responsible for specific phenotypes. Defined BAC clones
will be used for rescuing mutant phenotypes, which will help
establish gene function. In addition, researchers will begin to
use genetic mapping of mutations on the nuclear genome to
routinely clone genes; one primary goal of the C. reinhardtii
genome initiative is to provide sets of mapping primers in a
96-well format to stimulate the use of this approach. However,
as genome sequence and annotation become more precise, we
expect that reverse genetics will emerge as the centerpiece of
functional genomics in C. reinhardtii, as it is now for Arabidopsis.
This approach will exploit RNA interference and antisense
RNA technologies to suppress gene expression and use tilling
(70, 75, 123) to identify allelic series for specific genes; the
phenotypes associated with the different alleles will help elucidate
the relationship between gene structure and function.

In the very near future, global expression analyses are likely
to take a central position in C. reinhardtii genomics. As our
knowledge of transcribed regions in the genome becomes secure,
construction of a full-genome microarray will be possible,
enabling the synthesis of a more complete picture of the control
of gene expression. Integration of the expression data will
generate a catalog that describes the activity of each gene and
facilitates construction of "coregulation graphs," providing
clues to the physiological role of many genes of unknown
function. Finally, microarray analyses applied to strains mutated
for putative regulators will identify suites of genes subject
to common control mechanisms.
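One simple way to picture a "coregulation graph" is as a network in which
genes are linked when their expression profiles rise and fall together
across conditions. The Python sketch below is one possible construction and
is not taken from the text: it assumes Pearson correlation as the similarity
measure and an arbitrary cutoff of 0.9, and the function names `pearson`
and `coregulation_graph` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def coregulation_graph(expression, threshold=0.9):
    """Link genes whose profiles correlate at or above `threshold`.

    `expression` maps a gene name to its expression values across the
    same ordered set of conditions. Returns an adjacency dict of sets.
    The threshold is an arbitrary illustrative choice.
    """
    graph = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if pearson(expression[g1], expression[g2]) >= threshold:
            graph[g1].add(g2)
            graph[g2].add(g1)
    return graph
```

In such a graph, a gene of unknown function that is linked to several
well-characterized genes becomes a candidate for a role in the same
physiological process.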

While analysis of transcript behavior in dynamic environments
will be one of the most rapid outcomes of whole genome
information, many key cellular processes must be studied at
the level of protein abundance and activity. The European
Community is committed to building a program around C.
reinhardtii proteomics. Initially, the focus will be to identify
components localized to specific subcellular compartments,
and in particular those that traffic to the chloroplast and
mitochondrion. While no program currently available can accurately
predict organellar targeting for C. reinhardtii, the results
obtained by proteomic analyses should generate training sets
that stimulate the development of robust predictor algorithms.
Quantitative proteomics will also shape our understanding of
environmental pressures that modulate levels and activities of
specific proteins. Global analyses at both the protein and
transcript levels, combined with computational and informatic
approaches, will help predict functions of specific gene products
in both metabolic and regulatory pathways and identify promoter
sequences important for controlling suites of genes.
Sequence information concerning promoter structure and
function can be coupled with biochemical data (84, 90, 128) to
determine, in a direct way, cis-acting sequences that modulate
promoter activity. Antibodies to specific regulatory proteins
identified in mutant screens can be used for chromatin
immunoprecipitation (80, 88, 130), which would help establish
specific protein-DNA interactions. Furthermore, two-hybrid (51,
53, 67, 124) and tandem-affinity purification (91) methodologies
can be used to explore functional protein-protein interactions.

As with any organism, a strictly statistical analysis of genome
sequence properties can be used to identify general and local
properties of the genome such as isochores, large and small
duplications, consensus sequences for splice junctions, and
codon bias and its relationship to the level of expression of a
gene or its evolutionary history, etc. However, because of the
large underlying body of genetic, gene expression, and biochemical
data, we can also predict breakthroughs in our ability
to describe metabolic and regulatory pathways, and identify
novel pathways as well as those that are absent or modified in
specific organisms.
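As a small illustration of the kind of statistical analysis mentioned above,
codon bias can be summarized by tabulating how often each codon appears in a
coding sequence. The Python sketch below is a minimal example under stated
assumptions: the input is a single in-frame coding sequence, any trailing
partial codon is ignored, and the function name `codon_usage` is
hypothetical.

```python
from collections import Counter


def codon_usage(cds):
    """Return the relative frequency of each codon in an in-frame sequence.

    Trailing bases that do not form a complete codon are ignored.
    An empty input yields an empty dict.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}
```

Comparing such frequency tables between genes, or between sets of highly
and weakly expressed genes, is one way to relate codon bias to expression
level.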

How C. reinhardtii genomics is going to evolve in the next
few years is a question for the whole community. Already, the
developments described here have attracted new investigators
to the organism and invigorated established investigators,
offering them a new palette of tools that will undoubtedly create
new landscapes in biological knowledge.